In this paper and a companion paper, we attempt to systematically investigate the possibility 
that the concept of information may enable a derivation of the quantum formalism from a set of 
physically comprehensible postulates. To do so, we formulate an abstract experimental set-up and 
a set of assumptions based on generalizations of experimental facts that can be reasonably taken to 
be representative of quantum phenomena, and on theoretical ideas and principles, and show that it 
is possible to deduce the quantum formalism. In particular, we show that it is possible to derive 
the abstract quantum formalism for finite-dimensional quantum systems and the formal relations, 
such as the canonical commutation relationships and Dirac's Poisson Bracket rule, that are needed 
to apply the abstract formalism to particular systems of interest. The concept of information, via 
an information-theoretic invariance principle, plays a key role in the derivation, and gives rise to 
some of the central structural features of the quantum formalism. 



I. INTRODUCTION 

Over the last two decades, a number of authors have 
expressed the view that our efforts to develop an under- 
standing of quantum theory are impeded by a lack of un- 
derstanding of the physical origin of the quantum formal- 
ism, and that our efforts would thereby be significantly 
aided by a systematic derivation of the formalism from 
a set of physically comprehensible assumptions.
Furthermore, several authors have proposed that the con- 
cept of information may be the key, hitherto missing, in- 
gredient which, if appropriately applied and formalized, 
might make such a derivation possible.

The proposal that information might enable a deriva- 
tion of quantum formalism rests, to a significant degree, 
upon the recognition that the concept of information 
plays a new and fundamental role in quantum physics. 
One way to see this is as follows. In classical physics, 
an experimenter presented with a system in an unknown 
state can, in principle, perform an ideal measurement 
upon the system which gives perfect knowledge about 
the state of the system. Hence, there is no fundamental 
distinction between the state and an ideal experimenter's 
knowledge of the state. In quantum physics, however, an 
ideal measurement (or even a finite number of such mea- 
surements performed upon an ensemble of identically- 
prepared systems) provides only partial knowledge about 
the unknown state of a quantum system. Hence, in sharp 
contrast to the situation in classical physics, a fundamen- 
tal distinction is drawn between the state and the knowl- 
edge that the experimenter can conceivably have of it. 
The concept of information then immediately assumes a 
fundamental role through the natural attempt to quanti- 
tatively relate the two: 'How much information has been 
obtained by the experimenter about the state?' 

One of the earliest attempts to explore the role of information
is due to Wootters. Suppose that Alice has a Stern-Gerlach
apparatus oriented at angle $(\theta, \phi)$, and attempts to
communicate the angle $\theta$ to Bob using spin-1/2 particles as
follows. Alice prepares n spin-1/2 particles in the corresponding
spin state using her Stern-Gerlach apparatus, and sends the
particles to Bob, who measures them using a vertically-aligned
Stern-Gerlach apparatus. The data he obtains provides information
about the outcome probabilities, $P_1, P_2$, of the measurement,
where $P_1$ is the probability of a spin emerging in the positive
channel. Since, from quantum theory, $P_1 = \cos^2(\theta/2)$, Bob
thereby gains information about $\theta$. However, we can now ask
the question: suppose we did not know quantum theory, and instead
simply regarded the experimental arrangement as a way for Bob to
learn about $\theta$ by observing the frequencies of the two
possible outcomes of his Stern-Gerlach apparatus; what
function $P_1(\theta)$ would maximize the amount of information
obtained by Bob about $\theta$ for given n? Wootters finds that, if
the information is quantified using the Shannon information
measure, then, in the limit as $n \to \infty$, the function
is $P_1(\theta) = \cos^2(m\theta/2)$, where $m \in \mathbb{Z}^+$, a
generalized form of Malus' law, which includes the correct result
as a special case.
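
The special status of the family $P_1(\theta) = \cos^2(m\theta/2)$ can be illustrated numerically. The sketch below is not Wootters' calculation. As a rough proxy for the large-n information gain, it assumes the Fisher information of a two-outcome source, $(dP_1/d\theta)^2 / [P_1(1 - P_1)]$. The function names and the linear comparison curve are hypothetical, chosen only for illustration. For the generalized Malus' law this quantity equals $m^2$ at every angle, so no value of $\theta$ is communicated better than any other, whereas for the linear ramp it varies with $\theta$.

```python
import math


def malus(theta, m=1):
    """Generalized Malus' law: P_1(theta) = cos^2(m * theta / 2)."""
    return math.cos(m * theta / 2.0) ** 2


def linear_ramp(theta):
    """Hypothetical comparison curve: P_1 falls linearly from 1 to 0 on [0, pi]."""
    return 1.0 - theta / math.pi


def fisher_information(p_of_theta, theta, h=1e-5):
    """Fisher information about theta carried by one two-outcome trial.

    The derivative dP_1/dtheta is estimated by a central difference.
    """
    p = p_of_theta(theta)
    dp = (p_of_theta(theta + h) - p_of_theta(theta - h)) / (2.0 * h)
    return dp * dp / (p * (1.0 - p))


# Angles strictly inside (0, pi), so that 0 < P_1 < 1 for both curves.
angles = [0.3, 0.9, 1.5, 2.1, 2.7]
malus_info = [fisher_information(malus, t) for t in angles]
ramp_info = [fisher_information(linear_ramp, t) for t in angles]
```

Under these assumptions, `malus_info` is flat across the angles while `ramp_info` is not, which is one way of reading the statement that the sinusoidal law treats all angles even-handedly.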

Wootters' result is remarkable since it shows that, us- 
ing the standard inferential methods of probability the- 
ory and the well-established Shannon information mea- 
sure, and taking an operational approach that assumes 
the probabilistic nature of measurement outcomes, it is 
possible to make a correct, non-trivial physical predic- 
tion concerning a quantum experiment from a plausible 
information-theoretic principle. However, Wootters' at- 
tempt to generalize this result in the direction of the 
quantum formalism meets with limited success. 

More recently, other attempts have been made to examine and
quantify the gain of information in the measurement process.
These differ in various ways from Wootters' approach, but are
also able to derive the generalized form of Malus' law. However,
as with Wootters' approach, they are unable to generalize their
results to obtain a significant part of the quantum formalism.

In contrast, several other recent approaches which involve the
concept of information succeed in deriving a significant fraction
of the quantum formalism. However, at the outset, these
approaches make abstract assumptions of key importance which are
given no physical interpretation, and which detract from the
understanding of the physical origin of the quantum formalism
that can thereby be obtained. For instance, in one such approach,
it is shown that, provided one assumes that a complex number is
associated with each suitably-defined experimental set-up,
Feynman's rules [16] for combining complex probability amplitudes
can be derived from a set of plausible consistency conditions.
However, the choice of number field is not given a physical
interpretation, and an alternative choice of field, such as the
reals or quaternions, would lead to a different set of rules. In
other approaches of this kind, a similar choice regarding the
applicable number field is made at the outset [43].

In this paper and a companion paper [17] (hereafter 
referred to as Paper II), we attempt to build upon the 
insights provided by Wootters' approach, and formulate 
an information-theoretic principle and a set of physically 
comprehensible assumptions from which it is possible to 
derive the standard formalism of quantum theory. In par- 
ticular, we obtain the finite-dimensional abstract quan- 
tum formalism, namely (a) the von Neumann postulates 
for finite-dimensional systems, (b) the tensor product 
rule for expressing the state of a composite system in 
terms of the states of its sub-systems, and (c) the result 
due to Wigner that any symmetry transformation of a 
quantum system can be represented by a unitary or antiunitary
transformation [18]. In addition, we obtain the formal rules of
quantum theory [44], such as the canonical commutation relations,
which are necessary to apply
the abstract formalism to obtain concrete models of par- 
ticular experimental set-ups. We proceed as follows. 

First, in Sec. II A, we describe an idealized, abstract 
experimental set-up, which provides a general framework 
within which particular experimental set-ups can be de- 
scribed. The preparations, interactions, and measure- 
ments that are permitted in a given set-up are defined 
in an operational manner. This makes it possible to op- 
erationally specify set-ups, where, like those set-ups or- 
dinarily considered in quantum theory, the preparation 
provides the maximum possible control over the system 
insofar as predictions about the outcome probabilities 
of the measurement are concerned, and the interactions 
only affect the degrees of freedom of the state of the sys- 
tem that are under control of the preparation. 

Second, in Sec. II B, we present a set of postulates which
concern the behavior of measurements performed on the system, and
which determine the theoretical representation of measurements,
the state of the system, and physical transformations of the
system. The postulates are formulated so as to be physically
comprehensible, and an analysis of their comprehensibility is
presented in Sec. III. The key postulate is the Principle of
Information Gain, which expresses the idea that, although
different measurements yield different information about the
state of a system, they nonetheless provide the same amount of
information about the state. That is, although different
measurements provide different perspectives on a system, none is
informationally privileged with respect to any other.

Third, in Sec. IV, we show that, within the framework provided
by the abstract set-up, these postulates are sufficient to derive
the finite-dimensional abstract quantum formalism, apart from the
form of the temporal evolution operator. In Paper II, we
formulate an additional principle, the Average-Value
Correspondence Principle, with which we obtain the form of the
temporal evolution operator and the formal rules of quantum
theory.

In the course of the derivation, we find that the con- 
cept of information, via the principle of information gain, 
gives rise to a number of the key features of the quan- 
tum formalism, such as the importance of square-roots of 
probability (real amplitudes) and the sinusoidal variation 
of probability with parameters, and plays a key role in 
the restriction of possible transformations of state space 
to unitary and antiunitary transformations. 
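
The role of square-roots of probability can be made plausible with a standard identity from information geometry. It is given here as background only, not as the derivation carried out in this paper, and the symbol $q_i = \sqrt{P_i}$ is introduced purely for this illustration. Rewriting the information (Fisher) line element on the space of probability distributions in terms of the $q_i$ gives

```latex
\begin{align}
  P_i &= q_i^2, \qquad dP_i = 2 q_i \, dq_i, \\
  \sum_{i=1}^{N} \frac{(dP_i)^2}{P_i}
      &= \sum_{i=1}^{N} \frac{4 q_i^2 \, (dq_i)^2}{q_i^2}
       = 4 \sum_{i=1}^{N} (dq_i)^2, \\
  \sum_{i=1}^{N} q_i^2 &= \sum_{i=1}^{N} P_i = 1 .
\end{align}
```

In the $q_i$ coordinates the line element is Euclidean and the normalization condition confines the state to a unit sphere, so that probabilities naturally vary sinusoidally along great circles.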

We conclude in Sec. V with a discussion of the results. 



II. EXPERIMENTAL SET-UP AND 
POSTULATES 

In this section, we shall first present an idealized, 
abstract experimental set-up, which provides a general 
framework within which particular experimental set-ups 
can be described. We shall then state a set of postulates 
which determines the abstract theoretical model of the 
abstract experimental set-up. 

A. Abstract Experimental Set-up 

1. Introduction 

The description of an experimental set-up in a manner 
sufficiently precise to enable modeling using the quantum 
formalism involves the use of terms that are particular to 
the abstract language of the quantum formalism. For ex- 
ample, one speaks of a set-up that prepares a system in a 
pure state, but the concept of a pure state has a special- 
ized meaning which presupposes the quantum formalism. 
However, since our goal is to derive the formalism, our 
first task is to devise a way of defining, with sufficient 
precision, what constitutes an experimental set-up with- 
out making reference to such terms. 

At the outset, we shall adopt, as background assump- 
tions, the following idealizations drawn from classical 
physics: 

(a) Partitioning. The universe is partitioned into a 
system, the background environment (or simply, 
the background) [45] of the system, measuring 
apparatuses, and the rest of the universe. 

(b) Time. In a given frame of reference, one can speak 
of a physical time which is common to the system 
and its background, and which is represented by a 
real-valued parameter, t. 

(c) States. At any time, the system is in a definite 
physical state, whose mathematical description is 
called the mathematical state, or simply the state, 
of the system. The state space of the system is the 
set of all possible states of the system. 

The general abstract experimental set-up that we shall 
consider is shown in Fig. 1. A source provides identical 
copies of a physical system of interest. A preparation 
step either selects or rejects the incoming system. In a 
particular run of the experiment, a physical system from 
the source passes the preparation, and is then subject to 
a measurement or measurements. In addition, following 
the preparation, the system may undergo an interaction 
with a physical apparatus. 

We shall only consider set-ups which satisfy particular 
idealizations. In particular, we shall restrict considera- 
tion to measurements that have the following properties: 

(i) Finiteness: the measurements yield a finite number 
of possible outcomes, 

(ii) Distinctness: the possible outcomes of a measure- 
ment have distinct values, 

(iii) Repetition Consistency: when a measurement is im- 
mediately repeated, the same outcome is observed 
with certainty, and 

(iv) Classicality: the measurements do not involve aux- 
iliary quantum systems. 

In addition, we shall assume that interactions have the 
following properties: 

(i) Identity-preserving: the interactions preserve the 
identity of the system, and 

(ii) Reversible and deterministic: the interactions are 
reversible and deterministic at the level of the state 
of the system, and so can be represented as one-to- 
one maps over state space. 

We shall also assume that the background of the 
system can be adequately modeled within the classical 
framework insofar as its internal dynamics is concerned. 
For example, in the case of a system in a background
electromagnetic field, the field is assumed to be modeled
classically. Similarly, we shall assume that parameters which
determine the measurement being performed (the orientation of a
Stern-Gerlach apparatus, for instance) are described classically
as real-valued numbers. In short, it is assumed that the
non-classicality is entirely concentrated in the system and in
its interactions with the background and the measurement devices.



2. Completeness of a Preparation 

The essential purpose of the experimental set-up illus- 
trated in Fig. 1 is to allow some property of a physical 
system to be studied under controlled conditions [46]. 
Ideally, one would like to prepare the system such that, 
immediately following the preparation, one has as much 
knowledge as possible about the degrees of freedom of 
the state of the system that are relevant to the property 
under study, and one would like to interact with the sys- 
tem so that only these degrees of freedom are affected. 
For example, if one wishes to study the spin properties 
of a system, one would prepare the system so that its 
spin direction is fixed (in classical physics), or its state 
is pure (in quantum physics). Similarly, one would allow 
uniform B-field interactions since these only affect the 
spin degrees of freedom of the system, but non-uniform 
B-field interactions would be excluded since they couple 
spin and spatial degrees of freedom, and since spatial de- 
grees of freedom are not under control of the preparation. 

Now, ordinarily, we rely upon a particular physical the- 
ory to tell us which preparations are maximal with re- 
spect to a given measurement in the sense that they pro- 
vide us with as much control as physically possible over 
the degrees of freedom of the state of the system that are 
relevant to predictions concerning the outcomes of the 
given measurement, and which interactions are compat- 
ible with the preparation and measurement in the sense 
of only affecting the degrees of freedom that are under 
control of the preparation. However, since our goal is to 
derive the abstract quantum formalism, where measure- 
ments and interactions are treated purely in the abstract, 
it is necessary to find a way to establish when a prepa- 
ration is maximal with respect to a given measurement, 
and when an interaction is compatible with a preparation 
and measurement, in a correspondingly abstract manner. 

To do so, we make use of the fact that, in both classi- 
cal and quantum physics, a preparation is maximal with 
respect to a given measurement if and only if the prepa- 
ration is complete in that it renders the history of the 
system prior to the preparation irrelevant insofar as pre- 
dictions concerning the measurement outcomes are con- 
cerned. For example, in classical physics, if a preparation 
places a system in a precisely known state (which is, in 
principle, possible), one has a maximal degree of control 
over the state, and the results of subsequent measure- 
ments performed on the system are independent of the 
history of the system prior to the preparation, so that the 
preparation is also complete. The converse is also true. 

In quantum physics, one encounters a similar situation. 
FIG. 1: An abstract, idealized experimental set-up. A physical system (such as a silver atom) is emitted from a source, passes 
a preparation step, and is then subject to a measurement. The preparation is implemented as a measurement, A', which 
has $N_{A'}$ possible outcomes, followed by the selection of those systems which yield some outcome j ($j = 1, 2, \dots, N_{A'}$). The 
measurement, A, has $N_A$ possible outcomes. The measurement detectors are assumed not to absorb the systems that they 
detect. An interaction, I, may occur as indicated between the preparation and measurement. 



For example, consider an experimental set-up where, in 
each run, a spin-1/2 system undergoes a preparation by 
a Stern-Gerlach measurement device, and subsequently 
undergoes a Stern-Gerlach measurement. From quan- 
tum theory, we know that the preparation in this case 
is maximal with respect to the subsequent Stern-Gerlach 
measurement, and we also know that the outcome prob- 
abilities of the measurement are independent of the pre- 
preparation history of the spin-1/2 system, so that the 
preparation is also complete. The converse is also true. 
More generally, if the preparation of a quantum system 
is maximal with respect to a given projective measure- 
ment, then we know from quantum theory that a system 
is prepared in a pure state, so that the preparation is 
also complete with respect to the measurement; and con- 
versely. 

Now, most importantly, unlike the notion of maximal- 
ity, it is straightforward to operationalize the notion of 
completeness: continuing with the example of the spin- 
1/2 experiment, if one models the data obtained from 
the measurement in n runs of the experiment using a 
probabilistic source [13], one finds that, in the limit of 
large n, the outcome probabilities of the source are
independent of arbitrary pre-preparation interactions [48]
with the system.

Using this operationally-defined notion of complete- 
ness as a basis, we shall see below that it is possible to 
give precise expression to the idea that, roughly speaking, 
a pair of measurements are examining the same property 
of the system from different perspectives, and that an 
interaction is only manipulating this particular property 
of the system. 



3. Definitions 

The measurements employed in the abstract set-up are chosen from
a measurement set, $\mathcal{A}$. As mentioned previously, it will
be assumed that each measurement has the property of finiteness,
which we shall now operationalize by saying that, when the
measurement is carried out on a system which has been emitted
from the source and has undergone arbitrary interactions
thereafter, the measurement generates one of a finite number of
possible outcomes, a possible outcome being defined as one that
has a non-zero probability of occurrence. It will also be assumed
that the measurement detectors do not absorb the systems that
they detect.

A preparation consists of a measurement that deter- 
mines to which outcome the incoming system belongs, 
followed by the selection of the system if the measure- 
ment registers a given outcome, and the rejection of the 
system otherwise. If detectors that do not absorb the de- 
tected systems are unavailable, a preparation can instead 
be implemented using a measurement where one of the 
detectors is removed. 

Consider now an experiment (Fig. 1) in which a system from a
source is subject to a preparation consisting of measurement A',
with $N_{A'}$ possible outcomes, with outcome j selected
($j = 1, \dots, N_{A'}$), followed by measurement A (with $N_A$
possible outcomes), without an interaction in the intervening
time.

Suppose that the data obtained in n runs of the experiment are
modeled by a probabilistic source with $N_A$ possible outcomes,
whose most likely probabilities (calculated on the basis of the
data) are given by $\vec{P} = (P_1, P_2, \dots, P_{N_A})$,
where $P_i$ is the probability of the ith outcome
($i = 1, 2, \dots, N_A$). If, for all j, $\vec{P}$ is independent
of arbitrary pre-preparation interactions with the system in the
limit of large n, the preparation will be said to be complete
with respect to measurement A. If the completeness condition also
holds true when A and A' are interchanged, then A and A' will be
said to form a measurement pair.
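
The operational test of completeness can be sketched as a small simulation. The sketch is an illustration under stated assumptions, not part of the derivation. It uses the known quantum rule $P_+ = \cos^2(\Delta/2)$ for a spin-1/2 system measured at relative angle $\Delta$ to generate the data, and it models a pre-preparation interaction as an arbitrary initial spin direction. The function names (`measure`, `estimate_p_plus`) and the chosen angles are hypothetical.

```python
import math
import random


def plus_probability(state_angle, device_angle):
    """Assumed spin-1/2 rule: chance of '+' for a spin along state_angle
    measured by a Stern-Gerlach device along device_angle."""
    return math.cos((state_angle - device_angle) / 2.0) ** 2


def measure(state_angle, device_angle, rng):
    """One non-absorbing measurement: returns (outcome, post-measurement angle)."""
    if rng.random() < plus_probability(state_angle, device_angle):
        return "+", device_angle
    return "-", device_angle + math.pi


def estimate_p_plus(history_angle, prep_angle, meas_angle, n, rng):
    """Relative frequency of '+' at measurement A among n systems that
    passed the preparation (measurement A' with outcome '+' selected)."""
    selected = 0
    plus_count = 0
    while selected < n:
        outcome, state = measure(history_angle, prep_angle, rng)
        if outcome != "+":
            continue  # system rejected by the preparation step
        selected += 1
        if measure(state, meas_angle, rng)[0] == "+":
            plus_count += 1
    return plus_count / n


rng = random.Random(0)
histories = [0.2, 1.0, 2.5]  # hypothetical pre-preparation spin directions
estimates = [estimate_p_plus(h, 0.0, 1.1, 20000, rng) for h in histories]
```

Within sampling error, the entries of `estimates` agree with one another whatever the pre-preparation direction, which is the sense in which the preparation is complete with respect to the measurement.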

The set of measurements generated by A forms a measurement
set, $\mathcal{A}$, which is defined as the set of all
measurements that (i) form a measurement pair with A and
that (ii) are not a composite of other measurements
in $\mathcal{A}$. An important corollary of this definition is
that two measurement sets are either identical or disjoint.

Interactions that occur after the preparation step are chosen
from an interaction set, $\mathcal{I}$, which is defined as
follows. Suppose that, in the experiment of Fig. 1, an
interaction, I, occurs between the preparation and measurement.
If, for all $A, A' \in \mathcal{A}$, the preparation remains
complete with respect to the subsequent measurement, then I will
be said to be compatible with $\mathcal{A}$ and the source. The
set $\mathcal{I}$ is then defined as the set of all such
compatible interactions.

If there are two experimental set-ups, each with a source
containing identical copies of the same physical system, with
respective disjoint measurement sets, $\mathcal{A}^{(1)}$
and $\mathcal{A}^{(2)}$, then the set-ups will be said to be
disjoint. This makes precise the rough notion that the set-ups
examine different aspects of the same physical system.

4. An example 

To illustrate the above definitions, consider again the spin-1/2
experiment, where silver atoms emerge from a source (an
evaporator), pass through a Stern-Gerlach preparation device,
undergo an interaction, and finally undergo a Stern-Gerlach
measurement. In this case, the set, $\mathcal{A}$, generated by
any Stern-Gerlach measurement consists of all Stern-Gerlach
measurements of the form $A_{\theta,\phi}$,
where $(\theta, \phi)$ is the orientation of the Stern-Gerlach
device. However, measurements that are composed of two or more
Stern-Gerlach measurements are excluded from $\mathcal{A}$.

Consider now an interaction, $I_{\theta_B,\phi_B,t,\Delta t}$,
consisting of a uniform B-field acting during the
interval $[t, t + \Delta t]$ in some
direction $(\theta_B, \phi_B)$. If such an interaction occurs
between the preparation and measurement, one finds that the
completeness of the preparation with respect to the measurement
is preserved; that is, the interaction is compatible
with $\mathcal{A}$ and the system. Hence, all interactions in
which a uniform magnetic field acts between the preparation and
measurement are in the interaction set, $\mathcal{I}$. However,
interactions consisting of a non-uniform B-field do not preserve
completeness (viewed from the quantum theoretic model, such
interactions couple the spin and position degrees of freedom of
the system), and are therefore excluded from $\mathcal{I}$.

Finally, to illustrate the concept of disjoint set-ups, consider
a source which emits a system consisting of two distinguishable
spin-1/2 particles on each run of an experiment, and consider two
set-ups where the first set-up has a measurement
set $\mathcal{A}^{(1)}$ consisting of all possible Stern-Gerlach
measurements performed on one of the particles, and the second
has a measurement set $\mathcal{A}^{(2)}$ consisting of all
possible Stern-Gerlach measurements performed on the other
particle. In this case, the two measurement sets are disjoint.
The set-ups themselves are accordingly said to be disjoint, which
precisely expresses the notion that the two set-ups are examining
distinct aspects of the same physical system.



B. Statement of the Postulates. 

Consider the idealized experiment illustrated in Fig. 1, in
which a system passes a preparation step that employs a
measurement A' in measurement set $\mathcal{A}$, undergoes an
interaction, I, in the interaction set $\mathcal{I}$, and is then
subject to a measurement, A, in $\mathcal{A}$. The abstract
theoretical model that describes this set-up satisfies the
following postulates.

1. Measurements 

1.1 Finite and Probabilistic outcomes. When any 
given measurement $A \in \mathcal{A}$ is performed, one 
of N ($N \geq 2$) possible outcomes is observed. 
The ith outcome is obtained with probability $P_i$ 
($i = 1, \dots, N$), where $P_i$ is determined 
by the preparation, interactions, and measurement. 

1.2 Representation of Measurements. For any 
given pair of measurements $A, A' \in \mathcal{A}$, there 
exist interactions $I, I' \in \mathcal{I}$ such that A' can, 
insofar as probabilities of the outcomes and insofar 
as the output states of the measurement 
are concerned, be represented by an arrangement 
where I is immediately followed by A 
which, in turn, is immediately followed by I'. 

2. States 

2.1 States. With respect to any given measurement 
$A \in \mathcal{A}$, the state, $S(t)$, of a quantum 
system at time t is given by $(\vec{P}, \vec{\chi})$, 
where $\vec{P} = (P_1, P_2, \dots, P_N)$ and 
where $\vec{\chi} = (\chi_1, \chi_2, \dots, \chi_N)$ is a set 
of N real degrees of freedom. 

2.2 Physical interpretation of the $\chi_i$. When 
measurement $A \in \mathcal{A}$ is performed on a system in 
state $S(t)$ and the outcome i is observed, there 
are additional outcomes that are objectively 
realized but unobserved: 

(i) one of two outcomes, labeled a and b, 
which are obtained with respective probabilities 
$P_{a|i} = Q_{a|i}^2$ and $P_{b|i} = Q_{b|i}^2$, 
where $Q_{a|i} = f(\chi_i)$ and $Q_{b|i} = \tilde{f}(\chi_i)$, 
where $f$ is not a constant function and $f, \tilde{f}$ 
have range $[-1, 1]$, and 

(ii) one of two possible outcomes, with values 
labeled + and -, which is determined by 
the sign of either $Q_{a|i}$ or $Q_{b|i}$ depending 
upon whether a or b has been realized. 
2.3 Information Gain. When measurement $A \in \mathcal{A}$ 
is performed on a system in any given unknown 
state $S(t)$, the amount of Shannon-Jaynes 
information provided by the observed 
outcomes and the outcomes a and b about $S(t)$ 
in n runs of the experiment is independent 
of $S(t)$ in the limit as $n \to \infty$. 

2.4 Prior probabilities. The prior probability 
$\Pr(\chi_i | I)$, where I is the background knowledge 
of the experimenter prior to performing 
the experiment, is uniform for $i = 1, \dots, N$. 

3. Transformations. Any transformation of a prepared 
physical system, whether active (due to temporal 
evolution of the system), or passive (a symmetry 
transformation due to a change of the frame 
of reference), is represented by a map, $\mathcal{M}$, over the 
state space, $\mathcal{S}$, of the system. 

3.1 One-to-one. The map $\mathcal{M}$ is one-to-one. 

3.2 Invariance. The map $\mathcal{M}$ is such that, for any 
state $S \in \mathcal{S}$, the observed outcome probabilities, 
$P'_1, P'_2, \dots, P'_N$, of measurement $A \in \mathcal{A}$ 
performed upon a system in state $S' = \mathcal{M}(S)$ 
are unaffected if, in any representation, 
$(\vec{P}, \vec{\chi}) = (P_i; \chi_i)$, of the state S written 
down with respect to A, any arbitrary real 
constant, $\chi_0$, is added to each of the $\chi_i$. 

3.3 Parameterized Transformations. If a physical 
transformation is continuously dependent 
upon the real-valued parameter n-tuple $\pi$, and 
is represented by the map $\mathcal{M}_\pi$, then $\mathcal{M}_\pi$ is 
continuously dependent upon $\pi$. If the physical 
transformation is a continuous transformation, 
then, for some value of $\pi$, $\mathcal{M}_\pi$ reduces 
to the identity. 

3.4 Temporal Evolution. The map, $\mathcal{M}_{t,\Delta t}$, which 
represents temporal evolution of a system in a 
time-independent background during the 
interval $[t, t + \Delta t]$, is such that any state, S, 
represented as $(P_i; \chi_i)$, of definite energy E, 
whose observable degrees of freedom are 
time-independent, evolves to $(P'_i; \chi'_i)$, where $P'_i = P_i$ 
and $\chi'_i = \chi_i - E\Delta t/\alpha$, where $\alpha$ is a non-zero 
constant with the dimensions of action. 

4. Consistency. The posterior probability distributions 
over $\mathcal{S}$ that result from the following two processes 
coincide in the limit as $n \to \infty$: 

(i) inferring a posterior over $\mathcal{S}$ based upon the 
objectively realized outcomes when the measurement 
$A \in \mathcal{A}$ is performed upon n copies 
of a system in state S, and then transforming 
the posterior using $\mathcal{M}$, or 

(ii) inferring a posterior over $\mathcal{S}$ based upon the 
objectively realized outcomes when the measurement 
$A \in \mathcal{A}$ is performed upon n copies 
of a system in state $\mathcal{M}(S)$. 



The above postulates, together with the Average-Value 
Correspondence Principle (AVCP), which will be given 
in Paper II, suffice to determine the form of the abstract 
quantum model for the abstract set-up. From Postulates 1.1
and 1.2, it follows that, when any measurement
in $\mathcal{A}$ is performed on the system, one of N possible
outcomes is observed. Accordingly, we shall denote the abstract
quantum model of such a set-up by q(N).

Finally, we shall need Postulate 5, below, in order to 
obtain a rule, which we shall refer to as the composite 
systems rule, for relating the quantum model of a com- 
posite system to the quantum models of its component 
systems: 

5. Composite Systems. Suppose that a system admits 
a quantum model with respect to the measurement 
set $\mathcal{A}^{(1)}$ whose measurements have $N^{(1)}$ 
possible observable outcomes, and admits a quantum 
model with respect to measurement set $\mathcal{A}^{(2)}$ 
whose measurements have $N^{(2)}$ possible observable 
outcomes, where the sets $\mathcal{A}^{(1)}$ and $\mathcal{A}^{(2)}$ are disjoint. 

Consider the quantum model of the system with respect 
to the measurement set $\mathcal{A} = \mathcal{A}^{(1)} \times \mathcal{A}^{(2)}$ that 
contains all possible composite measurements consisting 
of a measurement from $\mathcal{A}^{(1)}$ and a measurement 
from $\mathcal{A}^{(2)}$. If the states of the sub-systems 
are represented as $(P_i^{(1)}; \chi_i^{(1)})$ ($i = 1, 2, \dots, N^{(1)}$) 
and $(P_j^{(2)}; \chi_j^{(2)})$ ($j = 1, 2, \dots, N^{(2)}$), respectively, 
then the state of the composite system can be 
represented as $(P_{ij}; \chi_{ij})$, where $P_{ij} = P_i^{(1)} P_j^{(2)}$ 
and $\chi_{ij} = \chi_i^{(1)} + \chi_j^{(2)}$. 
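
The composite systems rule can be related to the tensor product rule by a short numerical check. The sketch below assumes, as in the quantum model discussed in Sec. III A, that a pair $(P_i; \chi_i)$ is read as the complex components $\sqrt{P_i}\, e^{i\chi_i}$. That identification, the function names, and the example states are assumptions made for illustration only.

```python
import cmath
import math


def to_vector(probs, phases):
    """Map a state (P_i; chi_i) to components sqrt(P_i) * exp(i * chi_i)."""
    return [math.sqrt(p) * cmath.exp(1j * c) for p, c in zip(probs, phases)]


def composite_state(state1, state2):
    """Postulate 5: P_ij = P_i^(1) P_j^(2) and chi_ij = chi_i^(1) + chi_j^(2)."""
    (p1, c1), (p2, c2) = state1, state2
    probs = [a * b for a in p1 for b in p2]
    phases = [a + b for a in c1 for b in c2]
    return probs, phases


def kronecker(u, v):
    """Tensor (Kronecker) product of two vectors, index pairs in row-major order."""
    return [a * b for a in u for b in v]


# Hypothetical sub-system states, chosen only for illustration.
s1 = ([0.25, 0.75], [0.3, -1.2])
s2 = ([0.5, 0.2, 0.3], [0.0, 2.0, 0.7])

composite_probs, composite_phases = composite_state(s1, s2)
from_rule = to_vector(composite_probs, composite_phases)
from_product = kronecker(to_vector(*s1), to_vector(*s2))
max_gap = max(abs(a - b) for a, b in zip(from_rule, from_product))
```

Under the stated identification, `from_rule` and `from_product` agree to rounding error, since $\sqrt{P_i^{(1)} P_j^{(2)}}\, e^{i(\chi_i^{(1)} + \chi_j^{(2)})}$ is the product of the corresponding sub-system components.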

III. OVERVIEW OF THE POSTULATES 

Many of the postulates described above can be seen to 
follow from the quantum formalism, which provides some 
understanding of these postulates. Accordingly, we shall 
first point out the relations between these postulates and 
the quantum formalism. We shall then describe how the 
postulates can be physically understood. 

A. Postulates that follow from quantum theory 

Of the postulates enumerated above, all apart from 
Postulates 2.2, 2.3, 2.4 and 4 can be seen to follow from 
the quantum formalism. 

Consider the quantum theoretical model of the abstract
experimental set-up. Since the measurements in measurement
set $\mathcal{A}$ yield one of N possible distinct observable
outcomes, it follows that the state space of the quantum model is
N-dimensional. Furthermore, since a preparation (implemented
using a measurement $A' \in \mathcal{A}$) is complete with respect
to a measurement $A \in \mathcal{A}$, it follows that the system
immediately following the preparation step is in a pure
state, $\mathbf{v} \in \mathbb{C}^N$.
According to the quantum formalism, measurement A can be
represented by a Hermitian operator, $\mathsf{A}$. With respect to
this measurement, the ith component of $\mathbf{v}$ can be written
as $\sqrt{P_i}\, e^{i\phi_i}$, where $P_i$ is the outcome
probability of outcome i, so that the state can be represented as

    $\mathbf{v} = (P_1, \dots, P_N; \phi_1, \dots, \phi_N)$,    (1)

or $(P_i; \phi_i)$ for short, which yields Postulates 1.1 and 2.1.

In the quantum model, it is assumed that physical
transformations are represented by unitary or antiunitary
transformations of state space. Unitary and antiunitary
transformations are one-to-one maps, which gives
Postulates 3 and 3.1. To show Postulate 3.2, consider the
transformation of $\mathbf{v}e^{i\phi_0}$ by the unitary operator $U$. The
transformed vector is

$$\mathbf{v}' = e^{i\phi_0}\,U\mathbf{v}. \tag{2}$$

However, the outcome probabilities of any measurement
performed on the system in state $\mathbf{v}'$ are independent of
the overall phase of $\mathbf{v}'$. Therefore, these outcome
probabilities are unaffected if an arbitrary $\phi_0 \in \mathbb{R}$ is added to
the $\phi_i$, where $\mathbf{v}$ is represented as in Eq. (1).
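
The phase-invariance argument can be checked numerically. The sketch below is a hedged illustration: the $2 \times 2$ unitary, the state and the value of $\phi_0$ are arbitrary choices made here, and the helper names `apply` and `outcome_probs` are hypothetical.

```python
import cmath
import math


def apply(U, v):
    """Multiply the matrix U (a list of rows) by the vector v."""
    return [sum(u * x for u, x in zip(row, v)) for row in U]


def outcome_probs(v):
    """Outcome probabilities |v_i|^2 in the measurement basis."""
    return [abs(c) ** 2 for c in v]


# Assumed example: a real 2x2 unitary and a normalized state.
s = 1.0 / math.sqrt(2.0)
U = [[s, s], [s, -s]]
v = [math.sqrt(0.3), math.sqrt(0.7) * cmath.exp(0.4j)]

phi0 = 1.234  # arbitrary overall phase added to every phi_i
v_shifted = [c * cmath.exp(1j * phi0) for c in v]

p_plain = outcome_probs(apply(U, v))
p_shifted = outcome_probs(apply(U, v_shifted))
```

Here `p_plain` and `p_shifted` agree to floating-point precision, as the argument above requires.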

Postulate 3.3 is obtained in two parts. First, if a phys- 
ical transformation depends continuously upon a set of 
real-valued parameters, then it is represented by a uni- 
tary or antiunitary transformation whose degrees of free- 
dom also continuously depend upon these parameters. 
Second, continuous transformations are represented by 
unitary transformations. If a unitary transformation is 
a continuous function of a set of real-valued parameters,
then it is possible that, for some values of these parame- 
ters, the unitary transformation reduces to the identity. 

From the unitary operator $U_t(\Delta t) = \exp(-iH_t\Delta t/\hbar)$
for the evolution of a system during the interval $[t, t + \Delta t]$
in a time-independent background, where $H_t$ is the
Hamiltonian operator at time $t$, it follows that a state $\mathbf{v}$
which is an eigenstate of $H_t$ evolves into

$$\mathbf{v}' = e^{-iE\Delta t/\hbar}\,\mathbf{v}, \tag{3}$$

where $E$ is the energy of the state. In the representation
of Eq. (1), the state $(P_i; \phi_i)$ evolves to $(P_i; \phi_i - E\Delta t/\hbar)$,
and, since $\mathbf{v}$ and $\mathbf{v}'$ differ only by an overall phase, they
are observationally indistinguishable, which gives
Postulate 3.4.
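
The step from the evolution operator to the phase shift can be spelled out as follows. This is a sketch that assumes only the eigenvalue relation $H_t\mathbf{v} = E\mathbf{v}$ and the power-series definition of the exponential.

```latex
% Action of U_t(\Delta t) on an energy eigenstate, using H_t^k v = E^k v.
\begin{align*}
U_t(\Delta t)\,\mathbf{v}
  &= \sum_{k=0}^{\infty} \frac{1}{k!}
     \left(\frac{-i\,\Delta t}{\hbar}\right)^{k} H_t^{k}\,\mathbf{v}
   = \sum_{k=0}^{\infty} \frac{1}{k!}
     \left(\frac{-iE\,\Delta t}{\hbar}\right)^{k} \mathbf{v}
   = e^{-iE\Delta t/\hbar}\,\mathbf{v}, \\
(\mathbf{v}')_i
  &= \sqrt{P_i}\,e^{i\phi_i}\,e^{-iE\Delta t/\hbar}
   = \sqrt{P_i}\,e^{i(\phi_i - E\Delta t/\hbar)} .
\end{align*}
```

Every phase is shifted by the same amount while each $P_i$ is left unchanged.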

To show Postulate 1.2, suppose that one wishes to represent
$A'$ in terms of measurement $A$. Consider an arrangement
consisting of a unitary transformation $U$ immediately
followed by measurement $A$, followed immediately,
in turn, by $U^\dagger$. Suppose that measurements $A$
and $A'$ are represented by the operators $\hat{A}$ and $\hat{A}'$,
respectively, where $\hat{A}\mathbf{v}_i = a_i\mathbf{v}_i$ and $\hat{A}'\mathbf{v}'_i = a'_i\mathbf{v}'_i$. Then, if we
choose

$$U = \sum_i \mathbf{v}_i\,\mathbf{v}_i'^{\dagger}, \tag{4}$$

this arrangement behaves precisely the same as measurement
$A'$ insofar as the probabilities of the observed outcomes
and the corresponding output states are concerned.
To see this, note that, if the input state to the
arrangement is $\mathbf{v}' = \sum_i c'_i\mathbf{v}'_i$ (the $c'_i$ being complex constants
such that $\sum_i |c'_i|^2 = 1$), then $U\sum_i c'_i\mathbf{v}'_i = \sum_i c'_i\mathbf{v}_i$, and
therefore measurement $A$ yields outcome $i$ with probability
$|c'_i|^2$ and yields the corresponding state $\mathbf{v}_i$ up to an
irrelevant overall phase. The final output state of the
arrangement is therefore $U^\dagger\mathbf{v}_i = \mathbf{v}'_i$. Hence, the arrangement
behaves precisely as would measurement $A'$ performed
directly on a system in state $\mathbf{v}'$ in respect of the
probabilities of observed outcomes $1, 2, \dots, N$ and in
respect of the output states.
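
The construction of Eq. (4) can be verified on a small example. In the sketch below, the eigenvectors of $\hat{A}$ are taken to be the standard basis and those of $\hat{A}'$ a rotated basis; both choices, the rotation angle and the coefficients $c'_i$ are assumptions made purely for illustration.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dagger(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return [[M[c][r].conjugate() for c in range(n)] for r in range(n)]


def sum_of_outer_products(vs, ws):
    """Return U = sum_i v_i w_i^dagger for lists of vectors vs and ws."""
    n = len(vs[0])
    return [[sum(v[r] * w[c].conjugate() for v, w in zip(vs, ws))
             for c in range(n)] for r in range(n)]


# Assumed eigenvectors: v_i for A (standard basis), v'_i for A' (rotated basis).
t = 0.7
v_basis = [[1.0, 0.0], [0.0, 1.0]]
vp_basis = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

U = sum_of_outer_products(v_basis, vp_basis)

# Input state sum_i c'_i v'_i with |c'_1|^2 + |c'_2|^2 = 1.
c = [0.6, 0.8j]
state_in = [sum(c[i] * vp_basis[i][k] for i in range(2)) for k in range(2)]

after_U = matvec(U, state_in)                        # equals sum_i c'_i v_i
probs = [abs(x) ** 2 for x in after_U]               # outcome probabilities of A
outputs = [matvec(dagger(U), vi) for vi in v_basis]  # U^dagger v_i = v'_i
```

The components of `after_U` are the $c'_i$, so measurement $A$ returns outcome $i$ with probability $|c'_i|^2$, and applying $U^\dagger$ to each $\mathbf{v}_i$ returns the corresponding $\mathbf{v}'_i$.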

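Finally, by considering the tensor product $\mathbf{v} = \mathbf{v}^{(1)} \otimes \mathbf{v}^{(2)}$,
where $\mathbf{v}^{(1)} \in \mathbb{C}^{N_1}$ and $\mathbf{v}^{(2)} \in \mathbb{C}^{N_2}$ are the states of
two sub-systems, and $\mathbf{v} \in \mathbb{C}^N$, with $N = N_1N_2$, is the
state of the composite system, one finds that Postulate 5
follows at once.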
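
A short numerical sketch makes the last step explicit: the components of a tensor product have moduli squared $P_iP_j$ and phases $\phi_i + \phi_j$. The probabilities and phases used below are hypothetical values, chosen so that no phase sum wraps past $\pi$.

```python
import cmath
import math


def state(probs, phases):
    """Vector with components sqrt(P_i) * exp(i * phi_i)."""
    return [math.sqrt(p) * cmath.exp(1j * ph) for p, ph in zip(probs, phases)]


def tensor(v1, v2):
    """Components of the tensor product, ordered as (i, j) with j varying fastest."""
    return [a * b for a in v1 for b in v2]


# Hypothetical sub-system states with N1 = 2 and N2 = 3.
P1, phi1 = [0.25, 0.75], [0.1, 0.5]
P2, phi2 = [0.5, 0.3, 0.2], [0.0, 0.2, 0.4]

v = tensor(state(P1, phi1), state(P2, phi2))

P_ij = [abs(c) ** 2 for c in v]
phi_ij = [cmath.phase(c) for c in v]
expected_P = [p * q for p in P1 for q in P2]
expected_phi = [a + b for a in phi1 for b in phi2]
```

The computed `P_ij` and `phi_ij` match the product and sum rules stated in Postulate 5.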



B. Physical Comprehensibility of the postulates. 

When formulating the postulates, our goal has been to 
maximize their physical comprehensibility. For the pur- 
poses of discussion, it is helpful to distinguish two levels 
of physical comprehensibility. First, at the minimum, 
a comprehensible postulate is one that can be transpar- 
ently understood as a simple assertion about the physical 
world. If this is the case, we shall say that the postulate 
has the property of transparency. Second, a postulate has 
an additional level of comprehensibility if it can also be 
traced to well-established experimental facts and physi- 
cal ideas or principles (traceability) . 

To illustrate these ideas, consider the example of Ein- 
stein's postulate of the constancy of the speed of light. 
The postulate can be transparently understood as the 
simple assertion that measurements of the speed of light 
in different inertial frames will yield the same result. In 
addition, the postulate can also be understood as a di- 
rect generalization of the well-established results of the 
Michelson-Morley experiment, the generalization being 
achieved by an appeal to the general principle of the uni- 
formity of nature. Hence, the postulate is both transpar- 
ent and traceable. 

Since the assumptions underlying classical physics are 
transparent and traceable to well-established experimen- 
tal facts and theoretical ideas, and since these assump- 
tions remain fundamental to the way in which we concep- 
tualize the physical world, we attempt to preserve them 
as far as possible in the face of quantum phenomena. Ac- 
cordingly, we draw the majority of the postulates from 
classical physics, either by taking fundamental features of 
the theoretical framework of classical physics and mod- 
ifying these, if necessary, in light of experimental facts 
that are characteristic of quantum phenomena, or by 
transposing particular features of the classical models of 
physical systems into the quantum realm via a classical- 
quantum correspondence argument. Furthermore, in our 
treatment of information, we use the standard inferential 
methods of probability theory, and employ the conceptually
and mathematically well-established framework of
Shannon information theory. The remaining assump- 
tions, which have no obvious classical counterparts, are 
based on experimental facts that are characteristic of 
quantum phenomena but have no classical analog, or are 
based on novel theoretical ideas and principles. 

In our discussion below, we shall divide the postu- 
lates into (i) postulates that are adopted from classical 
physics, or are modified therefrom in light of experimen- 
tal facts characteristic of quantum phenomena, (ii) pos- 
tulates that are obtained through a classical-quantum 
correspondence argument, and (iii) novel postulates with 
no classical counterparts. 

1. Postulates adopted from classical physics. 

A classical model of a physical system is based upon 
the partitioning, time and states background assump- 
tions given earlier, and these are adopted unchanged in 
the abstract quantum model. The classical model additionally
makes the following key assumptions:

A Measurements.

A1 Operational Determinacy. The outcome of a
measurement performed on the system is determined
by experimentally-controllable variables.

A2 Continuum. The values of the possible outcomes
of a measurement form a real-valued
continuum.

B States.

B1 Determinacy. The state of the system and
a theoretical description of a measurement
that is performed on the system determine the
measurement outcome.

C Transformations.

C0 Mappings. Physical transformations of the
system, either due to temporal evolution or
due to a passive change of frame of reference,
are represented by mappings over the space of
states.

C1 One-to-one. The mappings are one-to-one.

C2 Continuity. If a map represents a physical
transformation that depends continuously
upon a real-valued set of parameters, then the
map is continuously dependent upon these
parameters.

C3 Continuous transformations. If a map represents
a continuous transformation (such as
temporal evolution) that depends continuously
upon a set of real-valued parameters,
then, for some value of these parameters, the
map reduces to the identity.



We remark that the measurements mentioned in A1-A2
are idealized, fundamental measurements, such as mea- 
surements of the position of a particle, which, in the 
framework of classical physics, are assumed to yield a 
continuum of possible outcomes jsoj . Similarly, although 
fundamental measurements of a physical quantity in a 
particular situation (such as the frequency of a bound 
membrane) may take a discrete number of possible val- 
ues, it is assumed that the discreteness arises through 
the particular boundary conditions that are applicable, 
rather than being an intrinsic feature of the measure- 
ments themselves. 

We also remark that, in C0-C3, it is assumed that 
physical transformations of a physical system are deter- 
ministic and reversible, which prevents the description of 
irreversible or indeterministic transformations within the 
classical framework at a fundamental level. 

First, we consider those postulates which adopt classical
assumptions unchanged. Postulates 3 and 3.1 correspond,
respectively, to assumptions C0 and C1, while
Postulate 3.3 is a combination of assumptions C2 and C3.

Second, in light of the results of experiments involving
quantum systems (such as Stern-Gerlach measurements
on silver atoms), it is reasonable to modify
assumptions A1, A2 and B1 as follows:

A1' Probabilistic operational determinacy. The data
obtained when a measurement is performed on
the system are best modeled by a probabilistic
source whose outcome probabilities are determined
by experimentally-controllable variables.

A2' Finiteness. A measurement performed on a system
has a finite number of possible outcomes.

B1' Probabilistic determinacy. The state of the system
and a theoretical description of a measurement that
is performed on the system only probabilistically
determine the measurement outcome.

We emphasize that, although these modifications are rea- 
sonable, they are not the only possibilities consistent with 
the experimental facts. For example, the probabilistic 
operational determinacy that one finds empirically can 
be accommodated in at least two ways. First, one can 
assume that the state of the system does, in fact, deter- 
mine the outcome of a measurement performed upon the 
system, but that one cannot, for some reason, control all 
of the relevant degrees of freedom of state. Second, one 
can assume that the degrees of freedom of the state only 
determine the probability that a measurement yields a 
particular value. In this instance, we have taken the lat- 
ter option. 

These modified assumptions are contained within
Postulates 1.1 and 2.1. Specifically, Postulate 1.1 contains
assumptions A1' and A2', while Postulate 2.1 incorporates
assumption B1'.

2. Postulates obtained through classical-quantum
correspondence.

A general guiding principle in building up a quantum 
model of a physical system is that, in an appropriate 
limit, the predictions of the quantum model of the sys- 
tem stand in some one-to-one correspondence with those 
of a classical model of the system. By establishing such a 
correspondence between the quantum and classical mod- 
els of a particle, we shall transpose several elementary 
properties of the classical model across to the quantum 
model and then, by generalization, to the abstract quantum
model, $\mathrm{q}(N)$.

Consider an experiment in which a position measurement
is used to prepare a particle at time $t_0$, and a position
measurement is subsequently performed at time $t_1$,
during which interval a potential $V(\vec{r}, t)$ is assumed to
act. When such an experiment is actually performed, 
one necessarily uses position measurements with a finite 
number of possible outcomes. In this case, the exper- 
imental results (where, for instance, an electron passes 
through a sub-micron aperture, is subject to electric-field 
interactions, and is subsequently detected on a screen) 
support the conclusion that, if these coarse position mea- 
surements are of sufficiently high spatial resolution, the 
preparation is, to a very good approximation, complete 
with respect to the subsequent measurement. 

Suppose, then, that a coarse position measurement 
with N possible outcomes is used to implement both the 
preparation and measurement steps, and further let us 
suppose that the coarse measurement is such that the 
probability that a detection is obtained in any run of the 
experiment is very close to unity. Further, let us suppose 
that the coarse measurement is of sufficient resolution 
that the preparation can be regarded as being complete 
with respect to the measurement. Then we can form a
quantum model, which we shall denote $\mathrm{q}^*(N)$, within the
framework of the abstract quantum model $\mathrm{q}(N)$, which
approximately describes the experiment after time $t_0$.

By Postulate 1.1 and the assumption B1' above, the
state, $S(t_1)$, of the system immediately prior to the
coarse position measurement determines the probability
n-tuple, $\vec{P}(t_1) = (P_1, \dots, P_N)$, where $P_i$ is the probability
of detection at the $i$th detector, which characterises
the data obtained from the coarse position measurement.

If the above experiment is repeated, except that the
coarse position measurement is delayed until time $t_2$,
then $S(t_1)$, together with a theoretical representation
of any interaction in the interval $[t_1, t_2]$, must (by
assumption B1') enable the prediction of the probability
n-tuple $\vec{P}(t_2)$ that describes the coarse position measurement
data obtained at time $t_2$. To determine what additional
degrees of freedom the state $S(t_1)$ must contain
in order to make this prediction possible, consider the
classical limit.

Suppose that the mass, $m$, of the particle is increased towards
values characteristic of macroscopic bodies. Under the assumption
made above, the preparation is complete with respect to the
measurement, so that the system continues to be well-described
by the model $\mathrm{q}(N)$ even in this classical limit.
However, as $m$ tends towards macroscopic values, it is
reasonable to expect that the system will increasingly
behave in accordance with its classical model between
times $t_1$ and $t_2$. That is, in this classical limit, we expect
that $\vec{P}(t_2)$, which is determined in the quantum model in
terms of $\vec{P}(t_1)$ and the other degrees of freedom in $S(t_1)$,
will coincide with the n-tuple $\vec{P}^{(\mathrm{CM})}(t_2)$ that is predicted
by a classical model of a particle of mass $m$ moving in
the same potential.

The relevant classical model in this situation is a par- 
ticle ensemble model. For such an ensemble model, one 
can choose to describe an ensemble for the case of given 
total energy by means of a probability density function 
over phase space, and to describe the evolution of this 
function using Newton's equations of motion. Alternatively,
one can employ the Hamilton-Jacobi model, which
is physically equivalent. We choose the latter since it is 
more easily described on a discrete spatial lattice. 

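In the Hamilton-Jacobi model, the state of the ensemble
is given by $(P(\vec{r},t), S(\vec{r},t))$, which satisfies the
Hamilton-Jacobi equations,

$$\begin{aligned}
\frac{\partial P}{\partial t} + \nabla\cdot\left(\frac{P\,\nabla S}{m}\right) &= 0, \\
\frac{1}{2m}(\nabla S)^2 + V(\vec{r},t) &= -\frac{\partial S}{\partial t}.
\end{aligned} \tag{5}$$

In the case of coarse position measurements with $N$ possible
outcomes, we shall use the discretized form of the
Hamilton-Jacobi state, $(P_i^{(\mathrm{CM})}; S_i)$, with $i = 1, \dots, N$,
and with $\vec{P}^{(\mathrm{CM})} = (P_1^{(\mathrm{CM})}, \dots, P_N^{(\mathrm{CM})})$, where $P_i^{(\mathrm{CM})}$ is
the probability that the position measurement yields a
detection at the $i$th measurement location, and $S_i$ is the
classical action at the $i$th measurement location.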

In order that the predictions of the quantum and
classical models agree in the classical limit, the quantum
state $S(t)$ ($t > t_0$) must contain degrees of freedom
which encode $N$ quantities, which we shall denote
$S_1^{(\mathrm{QM})}, \dots, S_N^{(\mathrm{QM})}$, which, in the classical limit, are
equal to the $S_i$. Equivalently, we shall assume that $S$ contains
$N$ dimensionless real quantities, $\chi_1, \dots, \chi_N$, such
that $S_i^{(\mathrm{QM})} = \alpha\chi_i$, where $\alpha$ is a constant with dimensions
of action.

From the above discussion, in the model $\mathrm{q}^*(N)$, the
state, $S$, is given by $(\vec{P}, \vec{\chi})$, where $\vec{\chi} = (\chi_1, \dots, \chi_N)$.
Postulate 2.1 directly generalizes this statement to the
abstract model $\mathrm{q}(N)$.

We now observe that the Hamilton-Jacobi model has 
the following properties, which can be readily verified 
from Eq. (5):

1. Invariance. The evolution of the
state $(\vec{P}^{(\mathrm{CM})}(t_1); S_i(t_1))$ to the
state $(\vec{P}^{(\mathrm{CM})}(t_2); S_i(t_2))$ is such that $\vec{P}^{(\mathrm{CM})}(t_2)$
is unchanged if an arbitrary real constant, $S_0$, is
added to each of the $S_i(t_1)$.

2. Temporal Evolution. In a time-independent background,
a state $(\vec{P}^{(\mathrm{CM})}(t); S_i(t))$ whose observable
degrees of freedom are time-independent evolves
in time $\Delta t$ to the state $(\vec{P}^{(\mathrm{CM})}(t); S_i(t) - E\Delta t)$,
where $E$ is the total energy of the system.

3. Composite Systems. If, with respect to
position measurements along the $x$ and $y$
axes, the Hamilton-Jacobi state of a particle
is $(P_i^{\mathrm{CM}(x)}; S_i^{(x)})$ and $(P_j^{\mathrm{CM}(y)}; S_j^{(y)})$, respectively,
then, with respect to $xy$-position measurements, its
state is $(P_{ij}^{\mathrm{CM}(xy)}; S_{ij}^{(xy)}) = (P_i^{\mathrm{CM}(x)}P_j^{\mathrm{CM}(y)}; S_i^{(x)} + S_j^{(y)})$.

Furthermore, from the first property, since the zero-value
of the $S_i$ is conventional and therefore has no physical
correlate, the prior probability $\Pr(S_i|I)$ must be invariant
under arbitrary changes of the zero-value of the $S_i$,
where $I$ represents the state of knowledge of the experimenter
prior to performing a measurement on the system.
The uniform prior is the only prior that has this invariance
property. Therefore, the prior $\Pr(S_i|I)$ is uniform,
which we shall list as a fourth property:

4. Prior Probabilities. The prior $\Pr(S_i|I)$ is uniform
($i = 1, 2, \dots, N$), where $I$ represents the state
of knowledge of the experimenter prior to performing
a measurement on the system.
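
Properties 1 and 2 can be checked directly from the continuum Hamilton-Jacobi equations. The sketch below assumes the standard form of those equations and, for property 2, the separable ansatz $S = W - Et$ with a time-independent potential $V(\vec{r})$; the ansatz and the symbol $W$ are illustrative assumptions rather than steps taken from the text.

```latex
% Property 1: shift S -> S + S_0, with S_0 a real constant.
\nabla (S + S_0) = \nabla S, \qquad
\frac{\partial (S + S_0)}{\partial t} = \frac{\partial S}{\partial t},
% so both Hamilton-Jacobi equations, and hence the evolution of P, are unchanged.

% Property 2: time-independent V and \partial P / \partial t = 0.
S(\vec{r}, t) = W(\vec{r}) - E t
\;\Longrightarrow\;
\frac{1}{2m}(\nabla W)^2 + V(\vec{r}) = E, \qquad
\nabla \cdot \left( \frac{P\,\nabla W}{m} \right) = 0,
% and therefore
S(\vec{r}, t + \Delta t) = S(\vec{r}, t) - E\,\Delta t .
```

In the discretized description, the second result is the statement that each $S_i$ decreases by $E\Delta t$ while the probabilities stay fixed.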

On the assumption of the above correspondence between
the Hamilton-Jacobi model and the model $\mathrm{q}^*(N)$,
it is now possible to transpose these properties to the
model $\mathrm{q}^*(N)$ in the classical limit.

For example, Postulate 2.4 is obtained as follows.
First, by using the relation $\Pr(S_i|I)\,|dS_i| = \Pr(\chi_i|I)\,|d\chi_i|$,
it follows that

$$\Pr(\chi_i|I)\,|dS_i/d\chi_i|^{-1} = \Pr(S_i|I). \tag{6}$$

Then, using the correspondence relation that $S_i = \alpha\chi_i$
in the classical limit, and noting that $\Pr(S_i|I)$ is uniform
(property 4, above), we conclude that, in the
classical limit, the model $\mathrm{q}^*(N)$ satisfies the condition
that $\Pr(\chi_i|I)$ is a constant. Second, the assumption is
made that this condition holds for the model $\mathrm{q}^*(N)$ not
only in the classical limit but also for microscopic values
of $m$ and, even more generally, that it holds for the
abstract quantum model $\mathrm{q}(N)$.

Postulates 3.2, 3.4 and 5 are obtained in a similar manner
by using the above correspondence, $S_i = \alpha\chi_i$, to
transpose the first three properties to the model $\mathrm{q}^*(N)$
in the classical limit, and then making the assumption
that the transposed properties hold more generally for
the abstract quantum model $\mathrm{q}(N)$.

3. Novel Postulates 

Below, we shall describe the four novel postulates, 
namely Postulates 1.2, 2.2, 2.3 and 4. 



Postulate 1.2: Representation of Measurements.
Consider an experiment in which Stern-Gerlach preparations
and measurements are performed upon silver
atoms, and where the set $\mathcal{A}$ consists of the elements $A_{\theta,\phi}$
representing Stern-Gerlach measurements in the direction
$(\theta, \phi)$. In this experiment, if an interaction consisting
of a uniform magnetic field acts between the preparation
and measurement, one finds that the probabilities
of the observed outcomes are the same as would be
obtained if a different measurement had been done with
the solenoid (the source of the field) absent.

Using this observation, one finds that it is possible to
implement the measurement $A_{\theta,\phi}$ using any given measurement
$A \in \mathcal{A}$ if it is immediately preceded and followed
by suitable interactions. The implementation behaves
precisely as $A_{\theta,\phi}$ insofar as the probabilities of observable
outcomes 1 and 2, and the corresponding output
states, are concerned. Postulate 1.2 can be regarded as
a plausible generalization of this observation.

Postulate 2.2: Physical interpretation of the $\chi_i$.
According to Postulate 2.1, the state $S(t)$, written with
respect to some measurement $A \in \mathcal{A}$, consists of the
pair $(\vec{P}, \vec{\chi})$, where $\vec{P}$ contains the probabilities of the
observed outcomes, and $\vec{\chi}$ is an ordered set of real-valued
degrees of freedom. Hence, the state consists of a mixture
of probabilities and degrees of freedom unconnected
to probabilities. Postulate 2.2 is motivated by the aesthetic
desideratum that a quantum state consist, as far
as possible, of probabilities of events, rather than being
such a mixture.

Accordingly, we postulate that $\chi_i$ encodes the probabilities
of some events, labeled $a$ and $b$. Hence, when measurement
$A$ is performed on the system, one of $2N$ possible
outcomes is obtained, with probabilities determined
by the state of the system. Since, by Postulate 1.1, the
probabilities of the observed outcomes of measurement $A$
are determined by the $P_i$, we are forced to postulate that,
for some reason to be investigated later, the outcomes $a$
and $b$ are not observed by the experimenter.

Now, we make the reasonable assumption that the
abstract quantum framework being developed is capable
of modeling the behavior of a photon when subject
to polarization measurements, and that this model
will agree with the predictions of electromagnetism under
a particle interpretation. An electromagnetic
plane wave of constant amplitude moving along the
$+z$-direction is described by the vector-valued function
$\mathbf{E} = E_0(\cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j})$, and the information about the
polarization of the wave is contained in $(\cos\theta, \sin\theta)$ with
respect to polarization measurements in the $xy$-plane. In
the particle interpretation, the probability that a photon
will pass through a polarizer whose axis points along
the $x$-axis or $y$-axis is given by $\cos^2\theta$ or $\sin^2\theta$, respectively.
The key feature which we wish to abstract from
this example is that, since the map from $(\cos\theta, \sin\theta)$ (the
'state-level') to $(\cos^2\theta, \sin^2\theta)$ (the 'probability-level') is
many-to-one, the computed probabilities are not the fundamental
quantities when describing the state of the photon.
Rather, the more fundamental quantities are $\cos\theta$
and $\sin\theta$, which we can regard as square roots of probability
in the range $[-1, 1]$, which are squared to obtain
probabilities.
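
The many-to-one character of the map from the state-level to the probability-level can be seen in a two-line computation. The sketch below compares the polarization angles $\theta$ and $-\theta$; the particular angle and the function names are assumptions chosen for illustration.

```python
import math


def amplitudes(theta):
    """State-level description: (cos(theta), sin(theta))."""
    return (math.cos(theta), math.sin(theta))


def probabilities(theta):
    """Probability-level description: (cos^2(theta), sin^2(theta))."""
    c, s = amplitudes(theta)
    return (c * c, s * s)


theta = 0.4  # hypothetical polarization angle in radians

# theta and -theta are distinct polarization states with equal probabilities.
amps_plus, amps_minus = amplitudes(theta), amplitudes(-theta)
probs_plus, probs_minus = probabilities(theta), probabilities(-theta)
```

The two states differ in the sign of $\sin\theta$, yet their probability pairs coincide, so the probabilities alone cannot distinguish them.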

To incorporate this two-layered feature into the abstract
quantum model, we assume that, following the
realization of outcome $a$ or $b$, one of two outcomes, labeled
$+$ and $-$, is obtained. This ensures that one binary-valued
degree of freedom is associated with each of the
$2N$ possible probabilistically-determined outcomes. Furthermore,
we assume that the value of $\chi_i$ determines
whether $+$ or $-$ is obtained via the sign of either $Q_{a|i}$
or $Q_{b|i}$, depending upon whether $a$ or $b$ was obtained,
where $P_{a|i} = Q_{a|i}^2$ and $P_{b|i} = Q_{b|i}^2$. In summary, the
quantum state consists of the $N$ probabilities $P_1, \dots, P_N$
and the $2N$ quantities $Q_{a|1}, Q_{b|1}, \dots, Q_{a|N}, Q_{b|N}$, which
encode the probabilities $P_{a|1}, P_{b|1}, \dots, P_{a|N}, P_{b|N}$ and encode
the values of the $2N$ binary-valued degrees of freedom.
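
The bookkeeping in this summary can be sketched in code. The example below assumes that $a$ and $b$ are the only two alternatives following outcome $i$, so that $Q_{a|i}^2 + Q_{b|i}^2 = 1$; this normalization, the numerical values and the function names are assumptions introduced here for illustration.

```python
import math


def joint_probabilities(P, Qa, Qb):
    """Probabilities of the 2N outcomes (i, a) and (i, b)."""
    out = []
    for p, qa, qb in zip(P, Qa, Qb):
        out.extend([p * qa * qa, p * qb * qb])   # P_i * P_{a|i}, P_i * P_{b|i}
    return out


def signed_roots(P, Qa, Qb):
    """Signed square roots sqrt(P_i) * Q_{a|i} and sqrt(P_i) * Q_{b|i}."""
    out = []
    for p, qa, qb in zip(P, Qa, Qb):
        out.extend([math.sqrt(p) * qa, math.sqrt(p) * qb])
    return out


# Hypothetical N = 2 state; each pair satisfies Qa^2 + Qb^2 = 1 (assumed).
P = [0.3, 0.7]
Qa = [0.6, -0.8]
Qb = [-0.8, 0.6]

joint = joint_probabilities(P, Qa, Qb)
roots = signed_roots(P, Qa, Qb)
signs = ['+' if q >= 0 else '-' for q in roots]   # binary-valued degrees of freedom
```

Under the stated assumption, the $2N$ joint probabilities sum to one, the signed roots form a unit vector, and the signs carry the $2N$ binary-valued degrees of freedom.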

In Sec. IV A 1, we sketch some ideas which help to provide
a better physical understanding of this postulate.

Postulate 2.3: Principle of Information Gain.
Postulate 2.3 asserts that, in the arrangement of Fig. 1, if
measurement $A \in \mathcal{A}$ is performed on a system in any
unknown state $S(t)$, then, in $n$ runs of the experiment, the
amount of information provided by the probabilistically-determined
outcomes (namely, one of $1, \dots, N$, followed
by either $a$ or $b$) about $S(t)$ is independent of $S(t)$ in
the limit as $n \to \infty$. This postulate can be understood
physically as follows.

Suppose that, in trial 1 of $n$ runs of an experiment, a
measurement $A$ is performed on a system in state $S(t)$,
and suppose that trial 2 is identical to trial 1 except that
measurement $A'$ is performed instead of $A$. Now, by
Postulate 1.2, trial 2 is equivalent (insofar as the probabilities
of the probabilistically-determined outcomes are
concerned) to trial 2', consisting of $n$ runs of an experiment
where a system in state $S(t)$ is sent through an
arrangement consisting of a suitable physical interaction
with the system, represented by map $\mathcal{M}$ (Postulate 3),
followed by measurement $A$, followed by another physical
interaction.

The data obtained in trials 1 and 2 provides information
(via the Shannon-Jaynes entropy functional, as
we shall later detail) about $S(t)$. Furthermore, since the
data obtained in trials 2 and 2' is statistically identical (as
ensured by Postulate 1.2), the amount of information obtained
about $S(t)$ in trial 2 is asymptotically equal to the
amount of information obtained about $S'(t) = \mathcal{M}(S(t))$
in trial 2'.

Now, suppose that, in one of the two trials 1 and 2,
the data obtained yields more information about the
state $S(t)$ than in the other trial. This implies that, in
the trials 1 and 2, one of the two measurements $A$ and $A'$
is privileged compared to the other insofar as the amount
of information that it yields about $S(t)$ is concerned. Although this
possibility cannot be ruled out a priori, we make the
intuitively plausible assertion that, although these different
measurements provide different perspectives on the
system, these perspectives are not informationally privileged.
Postulate 2.3 ensures that the amount of information
obtained in trials 1 and 2' is asymptotically equal
and, therefore, that the amount obtained in trials 1 and 2
is equal. That is, Postulate 2.3 can be understood as arising
from the requirement that no measurement in the
measurement set provides an informationally privileged
perspective on the system.

In order to quantify the amount of information gained, 
the Shannon-Jaynes entropy functional (also known as
the relative entropy) has been used (see Eq. (12)), which
is the continuum generalization of the Shannon en- 
tropy ^l[. Although other discrete information mea- 
sures, such as the Renyi or Tsallis entropies [l^ [20j . 
have been proposed, the Shannon-Jaynes entropy is pre- 
ferred here since the Shannon entropy has the clearest 
axiomatic basis (being derivable from a set of intuitively 
reasonable postulates [2l|, [l^, H^) and has strong indi- 
rect support through applications in communication the- 
ory and through the many successes of the maximum 
entropy method (see [U, [2^, for example), of which it 
forms the basis. 

In Sec. IV A 2, we shall develop a better understanding
of this postulate and describe some of its interesting
consequences.

Postulate 4: Consistency. A fundamental requirement
of a theoretical model is that it be internally consistent.
That is, if it is possible to make a particular
prediction via two distinct calculational pathways, the
predictions obtained must agree.

Postulate 4 considers the particular situation where 
one attempts to calculate a posterior probability distri- 
bution over state space on the basis of the objectively 
realized outcomes (see Postulate 2.2) in $n$ runs of an experiment
in which a measurement, $A$, is performed on a
system.

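In particular, one can arrive at the posterior, $p'(S)$, via
two calculational pathways:

[Diagram: starting from state $S$, the first route applies Map $\mathcal{M}$ to obtain $S' = \mathcal{M}(S)$ and then Measurement $A$; the second route applies Measurement $A$ and then Map $\mathcal{M}^*$.]
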

In the first route, in a given run of the experiment, state $S$
is first transformed to state $S' = \mathcal{M}(S)$, and then one
performs measurement $A$ on the system. On the basis of
the data obtained in $n$ runs, one then calculates a posterior
probability distribution over state space. In the
second route, in a given run, one first performs the measurement
on the system in state $S$. On the basis of the
data obtained in $n$ runs, one calculates a posterior, $p(S)$,
over state space, and then transforms this posterior using
the map $\mathcal{M}^*$, which is determined by $\mathcal{M}$.

Although these two calculational routes cannot be expected
to agree for finite $n$ owing to statistical fluctuations,
consistency requires that they agree (so that the
above diagram commutes) in the limit as $n \to \infty$.



IV. DEDUCTION OF THE QUANTUM 
FORMALISM 



In this section, we shall use the postulates described
above to derive the explicit form of the abstract quantum
model $\mathrm{q}(N)$, apart from the representation of temporal
evolution (which is derived in Paper II). We shall also derive
the composite systems rule which allows the abstract
quantum model of a composite system to be related to 
the abstract quantum models of its component systems. 

The derivation will proceed as follows. First, in
Sec. IV A, we shall explore the consequences of Postulate 2.3,
the principle of information gain. We shall
find that, if an information gain condition applies to
a probabilistic source with probability n-tuple
$\vec{P} = (P_1, P_2, \dots, P_M)$ ($M \ge 2$), then, if $\vec{P}$ is represented
as a unit vector, $\vec{Q} = (\sqrt{P_1}, \sqrt{P_2}, \dots, \sqrt{P_M})$, in a
real 'square-root of probability' space (or $Q$-space), the
prior $\Pr(\vec{Q}|I)$ is uniform over the positive orthant of the
unit hypersphere in this space.
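
To make the geometry concrete, the following sketch draws a point uniformly from the positive orthant of the unit hypersphere in $Q$-space and maps it back to a probability n-tuple. The sampling method (normalizing independent Gaussian draws) is a standard technique chosen here for illustration, and the function name and seed are hypothetical; none of this is part of the derivation itself.

```python
import math
import random


def sample_q_uniform(M, rng):
    """Draw a point uniformly from the positive orthant of the unit sphere in R^M.

    Normalizing a vector of independent standard normals gives a uniformly
    distributed direction; taking absolute values folds it into the
    positive orthant.
    """
    g = [abs(rng.gauss(0.0, 1.0)) for _ in range(M)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


rng = random.Random(0)        # fixed seed so the sketch is reproducible
M = 3
Q = sample_q_uniform(M, rng)  # unit vector with non-negative components
P = [q * q for q in Q]        # corresponding probability n-tuple
```

Each sampled $\vec{Q}$ is a unit vector, so the squared components form a valid probability n-tuple. Note that a prior that is uniform in $\vec{Q}$ is not uniform in $\vec{P}$.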

Second, following Postulates 1.1, 2.1, and 2.2, we shall
represent the state of a system, $S(t)$, in a $2N$-dimensional
$Q$-space, $Q^{2N}$. We shall then use Postulate 2.4 to determine
the form of the function $f$ that is introduced in the
postulates.

Third, in Sec. IV B, we shall use Postulates 3, 3.1, 3.2,
3.3 and 4 in order to obtain a representation of physical
transformations of a system. We shall find that
such transformations can be represented by a subset of
the orthogonal transformations of the unit hypersphere
in $Q^{2N}$. We shall then show that these transformations
can, equivalently, be represented by the set of unitary
and antiunitary transformations of a suitably-defined
$N$-dimensional complex vector space.

Fourth, in Sec. IV C, we shall draw upon Postulate 1.2
in order to obtain a representation of measurements on
a system.

Fifth, in Sec. IV D, we shall use Postulate 5 to obtain
a rule, the composite system rule, which determines the
state of a composite system in terms of the states of its
sub-systems.



A. Probabilistic Sources and Information Gain 

By Postulates 1.1, 2.1 and 2.2, the measurement $A$ on
the system in state $S(t)$ can, with respect to the outcomes
labeled $i$ and $a$ or $b$, be modeled as the interrogation of a
$2N$-outcome probabilistic source with probability n-tuple

$$(P_1P_{a|1}, P_1P_{b|1}, \dots, P_NP_{a|N}, P_NP_{b|N}). \tag{7}$$

From Postulate 2.2, $f$ has range $[-1, 1]$, so that all possible
values of $\vec{P}$ can be obtained by varying the state $S(t)$.
From Postulate 2.3, it therefore follows that, when this
probabilistic source with any given $\vec{P}$ is interrogated $n$
times, the amount of Shannon-Jaynes information obtained
about $\vec{P}$ by an experimenter who does not know
the value of $\vec{P}$ is independent of $\vec{P}$ in the limit as $n \to \infty$.
In order to implement this condition, we shall begin by
examining the process by which information is gained
about a probabilistic source.

1. Information gain from a probabilistic source.

Consider an experiment in which an $M$-outcome
probabilistic source, with probability n-tuple
$\vec{P} = (P_1, P_2, \dots, P_M)$, is interrogated $n$ times, yielding the
data string, $D_n = a_1a_2 \dots a_n$, of length $n$, where $a_r$ represents
the value of the $r$th outcome ($r = 1, \dots, n$).

Let us suppose that an experimenter knows that the
data is obtained from a probabilistic source, but does
not know the value of $\vec{P}$. Since the experimenter knows that
the data is generated by a probabilistic source, the order
of the $a_r$ is irrelevant, the only relevant data being
the number of instances, $m_i$, of each outcome $i$
($i = 1, \dots, M$), which can be encoded in the data n-tuple
$\vec{m} = (m_1, m_2, \dots, m_M)$, or, equivalently, in the pair $(\vec{f}, n)$,
where $\vec{f} = \vec{m}/n$ is the frequency n-tuple.

The experimenter's knowledge about $\vec{P}$ prior to the experiment
can be expressed as the prior probability density
function $\Pr(\vec{P}|I)$, where $I$ symbolizes the knowledge
that the experimenter possesses prior to performing the
interrogations.

After obtaining the data $(\vec{f}, n)$, the experimenter's
state of knowledge about $\vec{P}$ is represented by the posterior
probability density function, $\Pr(\vec{P}|\vec{f}, n, I)$. The posterior
can be related to the prior using Bayes' theorem,

$$\Pr(\vec{P}|\vec{f}, n, I) = \frac{\Pr(\vec{f}|\vec{P}, n, I)\,\Pr(\vec{P}|n, I)}{\Pr(\vec{f}|n, I)}, \tag{8}$$

where the function $\Pr(\vec{f}|\vec{P}, n, I)$, known as the likelihood,
is given by

$$\Pr(\vec{f}|\vec{P}, n, I) = \frac{n!}{(nf_1)! \cdots (nf_M)!}\,P_1^{nf_1} \cdots P_M^{nf_M}. \tag{9}$$

The function $\Pr(\vec{f}|n, I)$ can be obtained from the relation

$$\Pr(\vec{f}|n, I) = \int\!\cdots\!\int_R \Pr(\vec{f}|\vec{P}, n, I)\,\Pr(\vec{P}|n, I)\,dP_1 \cdots dP_M, \tag{10}$$

where $R$ is the set of $\vec{P}$ satisfying the conditions $0 \le P_i \le 1$
($i = 1, \dots, M$) and $\sum_i P_i = 1$. In addition, from Bayes'
theorem,

$$\Pr(\vec{P}|n, I)\,\Pr(n|I) = \Pr(n|\vec{P}, I)\,\Pr(\vec{P}|I), \tag{11}$$

and, using the fact that $n$ is chosen freely by the
experimenter and therefore cannot depend upon $\vec{P}$,
which implies that $\Pr(n|\vec{P}, I) = \Pr(n|I)$, it follows
that $\Pr(\vec{P}|n, I) = \Pr(\vec{P}|I)$.

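In order to quantify the experimenter's change in knowledge about
$\mathbf{P}$, we employ the Shannon-Jaynes information, which is defined as
follows. First, the Shannon-Jaynes entropy functional,

\[
H[\Pr(\mathbf{P})] = -\int \cdots \int_R \Pr(\mathbf{P}) \ln \frac{\Pr(\mathbf{P})}{\Pr(\mathbf{P}|\mathrm{I})}\, dP_1\, dP_2 \cdots dP_M, \tag{12}
\]

is used to quantify the change in the experimenter's uncertainty,
$\Delta H$, about $\mathbf{P}$ as a result of obtaining the data
$(\mathbf{f}, n)$. The experimenter's gain of Shannon-Jaynes information
about $\mathbf{P}$ is then defined as $\Delta K = -\Delta H$, which
quantifies the decrease in the experimenter's uncertainty (equivalently, the
increase in the experimenter's knowledge) about $\mathbf{P}$ as a result of
obtaining the data $(\mathbf{f}, n)$. The experimenter's gain of information
about $\mathbf{P}$ is therefore given by

\[
\begin{aligned}
\Delta K &= (\text{Initial uncertainty about } \mathbf{P}) - (\text{Final uncertainty about } \mathbf{P}) \\
&= H[\Pr(\mathbf{P}|\mathrm{I})] - H[\Pr(\mathbf{P}|\mathbf{f}, n, \mathrm{I})] \\
&= \int_R \Pr(\mathbf{P}|\mathbf{f}, n, \mathrm{I}) \ln \frac{\Pr(\mathbf{P}|\mathbf{f}, n, \mathrm{I})}{\Pr(\mathbf{P}|\mathrm{I})}\, d\mathbf{P},
\end{aligned} \tag{13}
\]

where we have used the fact that $H[\Pr(\mathbf{P}|\mathrm{I})] = 0$.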

From this expression, one can see that, for given
$\Pr(\mathbf{P}|\mathbf{f}, n, \mathrm{I})$, the value of $\Delta K$ depends
upon the prior probability, $\Pr(\mathbf{P}|\mathrm{I})$. However, this prior
is left undetermined by the theory of probability. For concreteness, consider
the case where $M = 2$. In that case, the likelihood is given by

\[
\Pr(\mathbf{f}|\mathbf{P}, n, \mathrm{I}) = \frac{n!}{m_1!\,(n - m_1)!}\, P_1^{m_1} (1 - P_1)^{n - m_1}, \tag{14}
\]

which, in the limit of large $n$, becomes very sharply peaked around
$m_1 = nP_1$ so that, in Eq. (10), the prior probability,
$\Pr(\mathbf{P}|\mathrm{I})$, factors out of the integrand, which, from
Eq. (8), implies that the posterior
$\Pr(\mathbf{P}|\mathbf{f}, n, \mathrm{I})$ can be approximated by

\[
\Pr(\mathbf{P}|\mathbf{f}, n, \mathrm{I}) = \frac{\Pr(\mathbf{f}|\mathbf{P}, n, \mathrm{I})}{\int \cdots \int_R \Pr(\mathbf{f}|\mathbf{P}, n, \mathrm{I})\, dP_1 \cdots dP_M}. \tag{15}
\]

Consequently, the posterior $\Pr(P_1|\mathbf{f}, n, \mathrm{I})$ can be
approximated by a Gaussian function of variance
$\sigma^2 = f_1(1 - f_1)/n$.
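The claim that the prior factors out at large $n$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the $M = 2$ posterior on a grid for two illustrative priors (the function names, the grid size, and the choice $n = 4000$, $m_1 = 1200$ are assumptions made here) and compares the posterior standard deviation with $\sqrt{f_1(1 - f_1)/n}$.

```python
import numpy as np

def posterior_std(m, n, log_prior, grid=200_001):
    """Std of P1 under posterior ∝ P1^m (1-P1)^(n-m) * prior(P1), on a uniform grid."""
    p = np.linspace(0.0, 1.0, grid)[1:-1]            # open interval avoids log(0)
    log_w = m * np.log(p) + (n - m) * np.log1p(-p) + log_prior(p)
    w = np.exp(log_w - log_w.max())                  # subtract max for numerical stability
    w /= w.sum()                                     # normalized grid weights
    mean = np.sum(w * p)
    return float(np.sqrt(np.sum(w * (p - mean) ** 2)))

# Two illustrative priors (assumptions of this sketch), given as log-densities
# up to an additive constant.
uniform = lambda p: np.zeros_like(p)
arcsine = lambda p: -0.5 * np.log(p * (1.0 - p))

n, m = 4000, 1200
f1 = m / n
target = np.sqrt(f1 * (1.0 - f1) / n)                # Gaussian-limit standard deviation

if __name__ == "__main__":
    print(posterior_std(m, n, uniform), posterior_std(m, n, arcsine), target)
```

Both priors give nearly the same width, which is the sense in which the posterior in Eq. (15) is insensitive to the prior for large $n$.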

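For the purpose of illustration, suppose the prior probability
$\Pr(\mathbf{P}|\mathrm{I})$ is chosen to be uniform on $\sum_i P_i = 1$, so
that $\Pr(P_1|\mathrm{I}) = 1$. Then Eq. (13) becomes

\[
\begin{aligned}
\Delta K &= \int \Pr(P_1|\mathbf{f}, n, \mathrm{I}) \ln \frac{\Pr(P_1|\mathbf{f}, n, \mathrm{I})}{\Pr(P_1|\mathrm{I})}\, dP_1 \\
&= \int \Pr(P_1|\mathbf{f}, n, \mathrm{I}) \ln \Pr(P_1|\mathbf{f}, n, \mathrm{I})\, dP_1
 - \int \Pr(P_1|\mathbf{f}, n, \mathrm{I}) \ln \Pr(P_1|\mathrm{I})\, dP_1 \\
&= -\ln\!\big(\sigma \sqrt{2\pi e}\big)
 = -\tfrac{1}{2} \ln\!\left(\frac{2\pi e\, f_1 (1 - f_1)}{n}\right),
\end{aligned} \tag{16}
\]

where we have made use of the standard result that, for a Gaussian
$G_{\mu,\sigma}(x)$ over $x$, with mean $\mu$ and standard deviation
$\sigma$, the integral

\[
\int G_{\mu,\sigma}(x) \ln G_{\mu,\sigma}(x)\, dx = -\ln\!\big(\sigma \sqrt{2\pi e}\big). \tag{17}
\]

Equation (16) clearly shows that the value of $\Delta K$ is dependent upon
$f_1$. In the limit of large $n$, $f_1$ tends to $P_1$. Thus, with the above
choice of the prior, the amount of information that the data provides about
$\mathbf{P}$ depends upon the value of $\mathbf{P}$. This observation raises
the possibility that one may be able to choose $\Pr(\mathbf{P}|\mathrm{I})$
in such a way that $\Delta K$ is independent of $P_1$ in the limit as
$n \to \infty$.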
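The dependence of $\Delta K$ on $f_1$ under a uniform prior can be illustrated with a short numerical sketch. It assumes the exact $M = 2$ posterior for a uniform prior (a Beta density with parameters $m_1 + 1$ and $n - m_1 + 1$), integrates $\Pr \ln \Pr$ on a grid, and compares the result with the large-$n$ Gaussian estimate $-\tfrac12 \ln\big(2\pi e f_1(1-f_1)/n\big)$. The function names and sample sizes are illustrative choices, not taken from the paper.

```python
import numpy as np
from math import lgamma, log, pi, e

def info_gain_uniform_prior(m, n, grid=200_001):
    """Delta K = integral of post * ln(post / prior) with prior density 1 on [0, 1]."""
    p = np.linspace(0.0, 1.0, grid)[1:-1]            # open interval avoids log(0)
    dp = p[1] - p[0]
    # Log-normalization of the Beta(m + 1, n - m + 1) posterior.
    log_norm = lgamma(n + 2) - lgamma(m + 1) - lgamma(n - m + 1)
    log_post = log_norm + m * np.log(p) + (n - m) * np.log1p(-p)
    post = np.exp(log_post)
    return float(np.sum(post * log_post) * dp)       # simple Riemann sum

def gaussian_estimate(f1, n):
    """Large-n estimate of Delta K from the Gaussian approximation to the posterior."""
    return -0.5 * log(2 * pi * e * f1 * (1 - f1) / n)

if __name__ == "__main__":
    n = 2000
    for m in (200, 1000):
        print(m / n, info_gain_uniform_prior(m, n), gaussian_estimate(m / n, n))
```

With the same $n$, the gain at $f_1 = 0.1$ is larger than at $f_1 = 0.5$, since the posterior is narrower there.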

Let us then suppose that an $M$-outcome probabilistic source has a prior
$\Pr(\mathbf{P}|\mathrm{I})$ such that the following condition holds:

Information Gain Condition. The amount of Shannon-Jaynes information
obtained about $\mathbf{P}$ in $n$ interrogations is independent of
$\mathbf{P}$ for all $\mathbf{P}$.

In order to implement this condition, we can make use of the fact that the
Shannon-Jaynes entropy is invariant under a change of variables [20]. To
illustrate the essential idea underlying the implementation, we shall first
give a simplified argument for the case where $M = 2$; a more rigorous and
general argument is given in the appendix.

Simplified argument for case $M = 2$. Suppose that
$\mathbf{P} = (P_1, P_2)$ is parameterized by the parameter $\lambda_1$,
where the parametrization is bijective over some interval,
$[\lambda_1^{\min}, \lambda_1^{\max}]$, of $\lambda_1$, and is
differentiable. Let us set $\Pr(\lambda_1|\mathrm{I})$ equal to a constant
(fixed by normalization) over $[\lambda_1^{\min}, \lambda_1^{\max}]$, and
zero otherwise.

As stated above, in the limit of large $n$, the posterior
$\Pr(P_1|\mathbf{f}, n, \mathrm{I})$ takes the form of a Gaussian with mean
$f_1$ and standard deviation $\sigma$. Similarly, as we shall later show
explicitly, the posterior $\Pr(\lambda_1|\mathbf{f}, n, \mathrm{I})$ in this
limit also takes the form of a Gaussian distribution, with mean
$\lambda_1^{(0)}$ defined through the relation
$f_1 = P_1(\lambda_1^{(0)})$. To find the standard deviation, $\sigma'$, of
the posterior over $\lambda_1$, we use the relation $P_1 = P_1(\lambda_1)$,

\[
\sigma = \left| \frac{dP_1}{d\lambda_1} \right|_{\lambda_1^{(0)}} \sigma', \tag{18}
\]

so that

\[
\sigma' = \left| \frac{dP_1}{d\lambda_1} \right|^{-1}_{\lambda_1^{(0)}} \sqrt{\frac{f_1 (1 - f_1)}{n}}. \tag{19}
\]

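Using the expression for $\sigma'$, the gain of information about
$\lambda_1$ (and hence about $\mathbf{P}$) is given by

\[
\begin{aligned}
\Delta K &= \int \Pr(\lambda_1|\mathbf{f}, n, \mathrm{I}) \ln \frac{\Pr(\lambda_1|\mathbf{f}, n, \mathrm{I})}{\Pr(\lambda_1|\mathrm{I})}\, d\lambda_1 \\
&= \int \Pr(\lambda_1|\mathbf{f}, n, \mathrm{I}) \ln \Pr(\lambda_1|\mathbf{f}, n, \mathrm{I})\, d\lambda_1
 - \int \Pr(\lambda_1|\mathbf{f}, n, \mathrm{I}) \ln \Pr(\lambda_1|\mathrm{I})\, d\lambda_1 \\
&= -\ln\!\big(\sigma' \sqrt{2\pi e}\big) - \ln\!\big(\Pr(\lambda_1|\mathrm{I})\big) \\
&= \ln\!\left[ \left| \frac{dP_1}{d\lambda_1} \right|_{\lambda_1^{(0)}} \frac{1}{\sqrt{f_1 (1 - f_1)}} \right]
 + \tfrac{1}{2} \ln\!\left( \frac{n}{2\pi e} \right) - \ln\!\big(\Pr(\lambda_1|\mathrm{I})\big).
\end{aligned} \tag{20}
\]

From this expression, one can see that the information gain will be
independent of $\lambda_1$ (and therefore independent of $P_1$) in the limit
as $n \to \infty$ if and only if

\[
\left| \frac{dP_1}{d\lambda_1} \right| \frac{1}{\sqrt{P_1 (1 - P_1)}} = 2|a|, \tag{21}
\]

where $a$ is a real constant and is non-zero since $P_1(\lambda_1)$ is
invertible, which implies that

\[
P_1 = \cos^2 (a\lambda_1 + b), \tag{22}
\]

where $b$ is some real constant. Finally, from the fact that
$\Pr(\lambda_1|\mathrm{I})$ is a constant, using the relation

\[
\Pr(P_1|\mathrm{I})\,|dP_1| = \Pr(\lambda_1|\mathrm{I})\,|d\lambda_1|, \tag{23}
\]

one finds that

\[
\Pr(P_1|\mathrm{I}) = \frac{1}{\pi} \frac{1}{\sqrt{P_1 (1 - P_1)}}. \tag{24}
\]

Hence, the above argument leads to the conclusion that the information gain
condition is satisfied for the case where $M = 2$ if and only if the prior
$\Pr(P_1|\mathrm{I})$ takes the above form. Furthermore, from Eqs. (19) and
(21), it follows from the expression for $\sigma'$ that

\[
\sigma' = \frac{1}{2|a|\sqrt{n}}. \tag{25}
\]

Hence, the posterior over $\lambda_1$ takes the form of a Gaussian
distribution whose standard deviation is independent of $\lambda_1^{(0)}$
and hence independent of $\mathbf{P}$.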

These results can be represented visually as follows. Define
$Q_i = \sqrt{P_i}$ ($0 \le Q_i \le 1$, $i = 1, 2$), and take
$\mathbf{Q} = (Q_1, Q_2)$ to be a vector in a two-dimensional real Euclidean
space. Then, from Eq. (22), it follows that

\[
Q_1 = \cos (a\lambda_1 + b). \tag{26}
\]

If we parameterize $\mathbf{Q}$ as

\[
\mathbf{Q} = (\cos\theta, \sin\theta), \tag{27}
\]

with $\theta \in [0, \pi/2]$, we obtain that $\theta = a\lambda_1 + b$.
Since $\Pr(\lambda_1|\mathrm{I})$ is a constant, it follows from the relation

\[
\Pr(\lambda_1|\mathrm{I})\,|d\lambda_1| = \Pr(\theta|\mathrm{I})\,|d\theta| \tag{28}
\]

that $\Pr(\theta|\mathrm{I})$ is also a constant. Hence, the prior over
$\theta$ is uniform over $[0, \pi/2]$. Conversely, if
$\Pr(\theta|\mathrm{I})$ is uniform, it follows from Eq. (27) that the prior
over $P_1$ is that given in Eq. (24). Hence, the statement that the prior
over $P_1$ is that given in Eq. (24) is equivalent to the statement that the
prior is uniform over the positive quadrant of the unit circle in $Q^2$.

We note also that, from Eq. (25), using the relation
$\theta = a\lambda_1 + b$ and Eq. (28), it follows that the posterior,
$\Pr(\theta|\mathbf{f}, n, \mathrm{I})$, over $\theta$ takes the form of a
Gaussian with standard deviation $\sigma_\theta = 1/(2\sqrt{n})$.
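As a numerical cross-check of the $M = 2$ argument, the sketch below evaluates the information gain of Eq. (13) for the prior of Eq. (24), for which the exact posterior is a Beta density with parameters $m_1 + \tfrac12$ and $n - m_1 + \tfrac12$. The predicted large-$n$ value, $\ln\pi + \tfrac12\ln\big(n/(2\pi e)\big)$, is obtained here by inserting Eqs. (21) and (28) into Eq. (20); it is a consequence worked out for this sketch rather than a formula quoted from the text, and the function names are illustrative.

```python
import numpy as np
from math import lgamma, log, pi, e

def info_gain_arcsine_prior(m, n, grid=400_001):
    """Delta K for the prior 1 / (pi * sqrt(P1 (1 - P1))), by grid integration."""
    p = np.linspace(0.0, 1.0, grid)[1:-1]            # open interval avoids log(0)
    dp = p[1] - p[0]
    log_prior = -log(pi) - 0.5 * np.log(p) - 0.5 * np.log1p(-p)
    # Posterior is Beta(m + 1/2, n - m + 1/2).
    log_norm = lgamma(n + 1) - lgamma(m + 0.5) - lgamma(n - m + 0.5)
    log_post = log_norm + (m - 0.5) * np.log(p) + (n - m - 0.5) * np.log1p(-p)
    post = np.exp(log_post)
    return float(np.sum(post * (log_post - log_prior)) * dp)

def predicted_gain(n):
    """Large-n value implied by Eq. (20) with a uniform prior over theta."""
    return log(pi) + 0.5 * log(n / (2 * pi * e))

if __name__ == "__main__":
    n = 2000
    for m in (200, 1000):
        print(m / n, info_gain_arcsine_prior(m, n), predicted_gain(n))
```

In contrast to the uniform prior over $P_1$, the gain here is essentially the same at $f_1 = 0.1$ and $f_1 = 0.5$.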

Statement of the general result. As shown in the appendix, the above results
for $M = 2$ generalize as follows. For an $M$-outcome probabilistic source,
the information gain condition is satisfied if and only if

\[
\Pr(\mathbf{P}|\mathrm{I}) = \frac{2}{A_{M-1}} \frac{1}{\sqrt{P_1 P_2 \cdots P_M}}\, \delta\!\Big(1 - \sum_{i=1}^{M} P_i\Big), \tag{29}
\]

where $A_{M-1}$ is the surface area of a unit $M$-ball.

Consider an $M$-dimensional real Euclidean space, $Q^M$, with axes
$Q_1, Q_2, \dots, Q_M$. If we define the vector
$\mathbf{Q} = (Q_1, Q_2, \dots, Q_M)$ such that $Q_i = \sqrt{P_i}$, where
$0 \le Q_i \le 1$, then every $\mathbf{Q}$ that represents a probability
n-tuple lies on the positive orthant, $S_+^{M-1}$, of the unit hypersphere,
$S^{M-1}$. Then, using the relation

\[
\Pr(\mathbf{Q}|\mathrm{I}) = \left| \frac{\partial(P_1, \dots, P_M)}{\partial(Q_1, \dots, Q_M)} \right| \Pr(\mathbf{P}|\mathrm{I}), \tag{30}
\]

it follows that the prior over $\mathbf{Q}$ is given by

\[
\Pr(\mathbf{Q}|\mathrm{I}) = \frac{2^{M+1}}{A_{M-1}}\, \delta\!\Big(1 - \sum_{i=1}^{M} Q_i^2\Big), \tag{31}
\]

which implies that the prior is uniform over $S_+^{M-1}$. Conversely, if the
prior is uniform over $S_+^{M-1}$, it follows that the prior over
$\mathbf{P}$ is that given in Eq. (29). Finally, in the limit as
$n \to \infty$, the posterior over $S_+^{M-1}$ is a symmetric Gaussian with
standard deviation $1/(2\sqrt{n})$.
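The equivalence between a uniform prior over the positive orthant and the prior over $\mathbf{P}$ can be illustrated by sampling. The sketch below draws points uniformly on the positive orthant of the unit hypersphere (by normalizing Gaussian vectors and taking absolute values, a standard construction assumed here), sets $P_i = Q_i^2$, and for $M = 2$ compares the empirical distribution of $P_1$ with the cumulative distribution $(2/\pi)\arcsin\sqrt{p}$ implied by Eq. (24). Function names, sample size and seed are illustrative.

```python
import numpy as np

def sample_positive_orthant(num, M, seed=0):
    """Draw `num` points uniformly from the positive orthant of the unit sphere in R^M."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((num, M))
    # Normalized Gaussian vectors are uniform on the sphere; abs() folds them
    # into the positive orthant without changing uniformity.
    return np.abs(g) / np.linalg.norm(g, axis=1, keepdims=True)

def arcsine_cdf(p):
    """CDF of the density 1 / (pi * sqrt(p (1 - p))) on [0, 1]."""
    return (2.0 / np.pi) * np.arcsin(np.sqrt(p))

Q = sample_positive_orthant(200_000, 2)
P1 = Q[:, 0] ** 2                                    # P_i = Q_i^2

if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, np.mean(P1 <= p), arcsine_cdf(p))
```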



2. Prior Probabilities over P

From the above discussion, it follows that Postulate 2.3 imposes a
particular prior over $\mathbf{P}$ (see Eq. (7)), namely

\[
\Pr(\mathbf{P}|\mathrm{I}) = \frac{2}{A_{2N-1}} \frac{1}{\sqrt{P_1 \cdots P_{2N}}}\, \delta\!\Big(1 - \sum_{q} P_q\Big), \tag{32}
\]

where $P_q$ denotes the $q$th component of $\mathbf{P}$. As in the previous
section, we shall describe $\mathbf{P}$ as a unit vector,

\[
\mathbf{Q} = (Q_1, Q_2, \dots, Q_{2N}), \tag{33}
\]

in $Q^{2N}$, where $Q_q = \sqrt{P_q}$ and $0 \le Q_q \le 1$.

From the results of the previous section, the prior over the positive
orthant of the unit hypersphere is uniform and, after obtaining the data
from $n$ runs of the experiment, in the limit as $n \to \infty$, the
posterior can be represented by a symmetric Gaussian distribution over the
positive orthant, with standard deviation $1/(2\sqrt{n})$.



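3. Determination of function f

In order to determine the unknown function $f$ which is introduced in
Postulate 2.2, we shall first use the prior over $\mathbf{P}$ to determine
the priors $\Pr(P_{a|i}|\mathrm{I})$ ($i = 1, \dots, N$), and then use the
relationship $P_{a|i} = F(\chi_i)$, where $F(\chi_i) = f^2(\chi_i)$
(Postulate 2.2), and the uniformity of the prior $\Pr(\chi_i|\mathrm{I})$
(Postulate 2.4) to determine $f$.

To determine the prior $\Pr(P_{a|i}|\mathrm{I})$, the first step is to find
the prior $\Pr(P_1, P_{a|1}, \dots, P_N, P_{a|N}|\mathrm{I})$ using the prior
in Eq. (32), where, from Eq. (7) and using the fact that
$P_{a|i} + P_{b|i} = 1$,

\[
P_{2i-1} = P_i P_{a|i}, \tag{34}
\]
\[
P_{2i} = P_i (1 - P_{a|i}), \tag{35}
\]

for $i = 1, \dots, N$. Using the relation

\[
\Pr(P_1, P_{a|1}, \dots, P_N, P_{a|N}|\mathrm{I}) =
\left| \frac{\partial(P_1, P_2, \dots, P_{2N-1}, P_{2N})}{\partial(P_1, P_{a|1}, \dots, P_N, P_{a|N})} \right| \Pr(\mathbf{P}|\mathrm{I}), \tag{36}
\]

in which the modulus of the Jacobian evaluates to $P_1 P_2 \cdots P_N$ (here
$P_1, \dots, P_N$ on the right-hand side of Eqs. (34) and (35) are the
outcome probabilities of the $i$), we find

\[
\Pr(P_1, P_{a|1}, \dots, P_N, P_{a|N}|\mathrm{I}) =
\frac{2}{A_{2N-1}}\, \delta\!\Big(1 - \sum_{i=1}^{N} P_i\Big) \prod_{i=1}^{N} \frac{1}{\sqrt{P_{a|i} (1 - P_{a|i})}}. \tag{37}
\]

Next, to find the marginal probability over $P_{a|i}$, we first marginalize
over $P_1, \dots, P_N$, to obtain

\[
\Pr(P_{a|1}, \dots, P_{a|N}|\mathrm{I}) = \prod_{i=1}^{N} \frac{1}{\pi} \frac{1}{\sqrt{P_{a|i} (1 - P_{a|i})}}, \tag{38}
\]

and then marginalize over
$P_{a|1}, \dots, P_{a|i-1}, P_{a|i+1}, \dots, P_{a|N}$, to obtain

\[
\Pr(P_{a|i}|\mathrm{I}) = \frac{1}{\pi} \frac{1}{\sqrt{P_{a|i} (1 - P_{a|i})}}. \tag{39}
\]

From Postulate 2.2, the probability $P_{a|i} = F(\chi_i)$, and, from
Postulate 2.4, the prior $\Pr(\chi_i|\mathrm{I})$ is uniform. Using Eq. (39)
and the relation

\[
\Pr(P_{a|i}|\mathrm{I})\,|dP_{a|i}| \propto \Pr(\chi_i|\mathrm{I})\,|d\chi_i|, \tag{40}
\]

where the proportionality is due to the fact that the prior
$\Pr(\chi_i|\mathrm{I})$ is non-normalizable, it follows that

\[
\left| \frac{dF}{d\chi_i} \right| \propto \sqrt{F(\chi_i) \big(1 - F(\chi_i)\big)}, \tag{41}
\]

which has the general solution

\[
F(\chi_i) = \cos^2 (a\chi_i + b), \tag{42}
\]

where $a$ and $b$ are real constants, and where $a \neq 0$ since, by
Postulate 2.2, the function $f(\chi_i)$ is not a constant function. Hence,
the functions $f$ and $\tilde{f}$ (see Postulate 2.2) have the form

\[
\begin{aligned}
f(\chi_i) &= \pm\cos (a\chi_i + b), \\
\tilde{f}(\chi_i) &= \pm\sin (a\chi_i + b),
\end{aligned} \tag{43}
\]

where the signs of $f$ and $\tilde{f}$ are undetermined.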
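That $F(\chi) = \cos^2(a\chi + b)$ satisfies the proportionality $|dF/d\chi| \propto \sqrt{F(1 - F)}$ can be confirmed with a finite-difference check. The sketch below is illustrative only: the values of $a$ and $b$, the step size, and the sampling interval (chosen so that $F(1 - F)$ does not vanish) are assumptions made here. The ratio of the two sides should equal the constant $2|a|$.

```python
import numpy as np

A, B = 1.7, 0.3                                      # illustrative constants a and b

def F(chi, a=A, b=B):
    """Candidate solution F(chi) = cos^2(a chi + b)."""
    return np.cos(a * chi + b) ** 2

def slope_ratio(chi, a=A, b=B, h=1e-6):
    """|dF/dchi| / sqrt(F (1 - F)), with the derivative taken by central differences."""
    dF = (F(chi + h, a, b) - F(chi - h, a, b)) / (2.0 * h)
    return np.abs(dF) / np.sqrt(F(chi, a, b) * (1.0 - F(chi, a, b)))

# Sample points chosen so that a*chi + b stays strictly inside (0, pi/2).
chi = np.linspace(0.05, 0.6, 50)
ratios = slope_ratio(chi)

if __name__ == "__main__":
    print(ratios.min(), ratios.max(), 2 * abs(A))
```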

4. Representation of state space.

Above, we have represented $\mathbf{P}$ as a unit vector, $\mathbf{Q}$, on
the positive orthant of the unit hypersphere in $Q^{2N}$. Now, the
binary-valued degrees of freedom in $S(t)$ described in Postulate 2.2 are
encoded into the signs of the $Q_{a|i}$ and $Q_{b|i}$. Therefore, if we
remove the condition of positivity imposed on the $Q_q$, then, given
$\mathbf{Q}$ on the unit hypersphere, $S^{2N-1}$, the probabilities $P_q$
can be read out using the relation $P_q = Q_q^2$, and the values of the $2N$
binary degrees of freedom are read out from the $2N$ signs (either $+$ or
$-$) of the $Q_q$. Graphically, the orthant containing $\mathbf{Q}$ encodes
the values of the binary degrees of freedom, while the location of
$\mathbf{Q}$ within a given orthant encodes the values of the $P_q$.

According to Postulate 2.2, $\mathbf{P}$ and the values of the $2N$ binary
degrees of freedom constitute all of the information that the quantum state,
$S(t)$, of the system provides about objectively realized physical events
when measurement A is performed on the system. Therefore, the value of
$\mathbf{P}$ and the values of the binary degrees of freedom can be taken to
completely represent $S(t)$ with respect to measurement A.

In particular, $\mathbf{Q}$ in $S^{2N-1}$ represents the state $S(t)$, where
now the only condition imposed on the $Q_q$ is that $P_q = Q_q^2$ for
$q = 1, \dots, 2N$. Hence, the set, $S^{2N-1}$, of unit vectors in $Q^{2N}$
represents the state space of the system.

Using the functions $f$ and $\tilde{f}$ from Eqs. (43), taking $a = 1$ and
$b = 0$ and choosing the positive signs, we can write
$Q_{a|i} = \cos\chi_i$ and $Q_{b|i} = \sin\chi_i$, and therefore can write
the state of a system with respect to measurement A as

\[
\mathbf{Q} = \big(\sqrt{P_1}\cos\chi_1,\; \sqrt{P_1}\sin\chi_1,\; \dots,\; \sqrt{P_N}\cos\chi_N,\; \sqrt{P_N}\sin\chi_N\big). \tag{44}
\]

In Paper II, we shall show that the above choice of the positive signs for
the functions $f$ and $\tilde{f}$ and choice of the constants $a$, $b$
involves no loss of generality.

The prior over $S^{2N-1}$ is the product of the priors due to the binary
degrees of freedom and due to $\mathbf{P}$. Since
$Q_1 = \sqrt{P_1}\cos\chi_1$ and $\Pr(\chi_1|\mathrm{I})$ is uniform, it
follows that the sign of $Q_1$ is a priori equally likely to be positive or
negative, and similarly for $Q_2, \dots, Q_{2N}$. Therefore, each orthant
is, a priori, equally likely to contain $\mathbf{Q}$. Since the prior due to
$\mathbf{P}$ is expressed by a uniform prior over the positive orthant, the
resultant prior over $S^{2N-1}$ is uniform.

In the case of the posterior over $S^{2N-1}$, the orthant containing
$\mathbf{Q}$ is known with a probability very close to unity in the limit of
large $n$. Therefore, the posterior over $S^{2N-1}$ in the limit as
$n \to \infty$ is arbitrarily well approximated by a probability density
function that consists of a symmetric Gaussian in the orthant containing
$\mathbf{Q}$, and is zero in all other orthants.



B. Mappings

According to Postulate 3, a physical transformation of a physical system is
represented by a map, $\mathcal{M}$, from state space to itself. In this
section, the general form of mappings that are consistent with the
postulates will be determined.

The derivation will be based upon Postulates 3.1-3.3 and Postulate 4, and
will proceed in four steps:

(1) Show that Postulates 3.1 and 4 imply that $\mathcal{M}$ is an
orthogonal transformation of the unit hypersphere in $Q^{2N}$.

(2) Show that the imposition of Postulate 3.2 restricts $\mathcal{M}$ to a
subset of the set of orthogonal transformations, and that these
transformations can be recast as unitary or antiunitary transformations
acting on a suitably-defined complex vector space.

(3) Show that any unitary or antiunitary transformation represents an
orthogonal transformation satisfying Postulates 3.1, 3.2, and 4.

(4) Show that a physical transformation which depends continuously upon a
real-valued parameter n-tuple can be represented by either unitary or
antiunitary transformations, that a continuous physical transformation can
only be represented by unitary transformations, and that a discrete
transformation can be represented by either a unitary or an antiunitary
transformation.



1. Step 1: Orthogonal Transformations

As discussed in Sec. IV A 4, the state space of a system can be represented
by the set of unit vectors, $S^{2N-1}$, in the $2N$-dimensional space
$Q^{2N}$. According to Postulate 3.1, the map $\mathcal{M}$ over state space
is one-to-one. Hence, the map over $S^{2N-1}$, which we shall denote by $T$,
is one-to-one.

We can now impose two further constraints on $T$. First, we have found that
the prior, $\Pr(\mathbf{Q}|\mathrm{I})$, is uniform over the unit
hypersphere. Under map $T$, the prior transforms into the probability
density function, $p(\mathbf{Q}')$, given by

\[
p(\mathbf{Q}') = \Pr(\mathbf{Q}|\mathrm{I}) \left| \frac{\partial(Q_1, \dots, Q_{2N})}{\partial(Q'_1, \dots, Q'_{2N})} \right|, \tag{45}
\]

where $\mathbf{Q}' = T(\mathbf{Q})$, with
$\mathbf{Q} = (Q_1, \dots, Q_{2N})$ and
$\mathbf{Q}' = (Q'_1, \dots, Q'_{2N})$. However, under the physical
transformation represented by $T$, no measurement has been performed by the
experimenter and therefore the prior assigned by the experimenter over the
unit hypersphere must remain unchanged. That is, the map $T$ must be such
that $p(\mathbf{Q}')$ is also uniform over the unit hypersphere, which
implies that

\[
\left| \frac{\partial(Q'_1, \dots, Q'_{2N})}{\partial(Q_1, \dots, Q_{2N})} \right| = 1. \tag{46}
\]


Hence, in general, under T, the probability density func- 
tion p(Q) transforms to the probability density function 



p(Q') =p(Q). 



(47) 



Second, from Postulate 4, we can, in the limit as $n \to \infty$, obtain a
posterior over $Q^{2N}$ of a system in state $\mathbf{Q}' = T(\mathbf{Q})$
in one of two equivalent ways:

(i) perform measurement $A \in \mathcal{A}$ upon $n$ copies of a system in
state $\mathbf{Q}$, and then use $T$ to transform the posterior
$\Pr(\mathbf{Q}|D_n, \mathrm{I})$ based on the data, $D_n$, consisting of
the realized outcomes, or

(ii) perform measurement $A \in \mathcal{A}$ upon $n$ copies of a system in
state $\mathbf{Q}'$, and write down the posterior
$\Pr(\mathbf{Q}|D'_n, \mathrm{I})$ based on the data, $D'_n$, consisting of
the realized outcomes.

Now, from the discussion of Sec. IV A 1, in the limit as $n \to \infty$, the
posterior, which we shall denote by $h$, over the unit hypersphere in
$Q^{2N}$, is zero apart from in one orthant, where it takes the form of a
symmetric Gaussian function whose standard deviation is a function of $n$
only. Therefore, the posteriors $\Pr(\mathbf{Q}|D_n, \mathrm{I})$ and
$\Pr(\mathbf{Q}|D'_n, \mathrm{I})$ are both of this form, with the symmetric
Gaussian functions having the same standard deviation. In order that
Postulate 4 holds for any measurement $A \in \mathcal{A}$ and for any
possible interaction in $\mathcal{I}$, it therefore follows that, in
addition to satisfying Eq. (47), the map $T$ must satisfy the condition that
any probability density function of the form $h$, containing a symmetric
Gaussian with given standard deviation, is mapped to a probability density
function which is asymptotically equal to a probability density function of
the form $h$ that contains a symmetric Gaussian with the same standard
deviation.

One can readily see that any orthogonal transformation of the unit
hypersphere will satisfy this condition since such a transformation will
take a symmetric Gaussian with given standard deviation to another symmetric
Gaussian with the same standard deviation. We shall now show that, in fact,
the set of all $T$ is precisely equal to the set of orthogonal
transformations over $S^{2N-1}$.

First, we shall show that, in order to satisfy the above condition, the map
$T$ must preserve the distance between any two points that lie in the same
orthant on the unit hypersphere. To see this, consider the converse.
Suppose, then, that there exist two points, $\mathbf{Q}_1, \mathbf{Q}_2$, on
the same orthant of the hypersphere such that
$d(\mathbf{Q}_1, \mathbf{Q}_2) \neq d(\mathbf{Q}'_1, \mathbf{Q}'_2)$, where
primes indicate vectors transformed by $T$, and where
$d(\mathbf{Q}_1, \mathbf{Q}_2)$ denotes the distance between $\mathbf{Q}_1$
and $\mathbf{Q}_2$ according to some given distance function, $d$. Choose a
function $h$ containing a symmetric Gaussian function which peaks at
$\mathbf{Q}_1$, and define the set $\mathcal{Q}^{(r)}$ as the set of all
points in the orthant at a distance $r = d(\mathbf{Q}_1, \mathbf{Q}_2)$ from
$\mathbf{Q}_1$.

Since the Gaussian is symmetric about $\mathbf{Q}_1$,
$h(\mathbf{Q}_a) = h(\mathbf{Q}_b)$ for all
$\mathbf{Q}_a, \mathbf{Q}_b \in \mathcal{Q}^{(r)}$. Therefore,
$\mathcal{Q}^{(r)}$ is a subset of a $2(N-1)$-spherical equiprobability
contour centered around $\mathbf{Q}_1$ of radius $r$. Since
$h(\mathbf{Q}_2) - h(\mathbf{Q}_1)$ decreases monotonically with
$d(\mathbf{Q}_1, \mathbf{Q}_2)$, $\mathcal{Q}^{(r)}$ contains all the points
in the orthant with the value $h(\mathbf{Q}_2)$.

Under the mapping $T$, the points $\mathbf{Q}'_1, \mathbf{Q}'_2$ are such
that $h'(\mathbf{Q}'_1) = h(\mathbf{Q}_1)$ and
$h'(\mathbf{Q}'_2) = h(\mathbf{Q}_2)$, where $h'$ is the transformed
posterior, so that $\mathcal{Q}^{(r)}$ maps to the equiprobability contour
$\mathcal{Q}'^{(r)}$. Now, by assumption, $T$ maps $h$ onto a function,
$h'$, that asymptotically approaches a probability density function of the
same form as $h$. Therefore, in particular, $T$ must preserve the shape of
the Gaussian function and its equiprobability contours. However, it was
supposed that $d(\mathbf{Q}'_1, \mathbf{Q}'_2) \neq r$. Therefore,
$\mathcal{Q}'^{(r)}$ contains a point, $\mathbf{Q}'_2$, that is not a
distance $r$ from $\mathbf{Q}'_1$. Therefore, unlike $\mathcal{Q}^{(r)}$,
the set $\mathcal{Q}'^{(r)}$ is not a subset of a $2(N-1)$-spherical
equiprobability contour of radius $r$, which leads to a contradiction.
Therefore, the original supposition must be false, which implies that $T$
preserves the distance between any two points $\mathbf{Q}_1, \mathbf{Q}_2$
that lie in the same orthant of the hypersphere.

In the case of two points that lie in different orthants, we argue as
follows. Consider first the simplest case where two points,
$\mathbf{Q}_1, \mathbf{Q}_2$, lie in adjacent orthants and $N = 2$. In this
paragraph and the next, primes label auxiliary points rather than
transformed vectors. Now, choose two points $\mathbf{Q}'_1, \mathbf{Q}'_2$
that lie in the first and second orthants, respectively. From the above
result, the distances $d(\mathbf{Q}_1, \mathbf{Q}'_1)$ and
$d(\mathbf{Q}_2, \mathbf{Q}'_2)$ are preserved under $T$. Suppose now that
the points $\mathbf{Q}'_1, \mathbf{Q}'_2$ are brought closer together,
whilst still remaining in their respective orthants. In the limit as
$d(\mathbf{Q}'_1, \mathbf{Q}'_2) \to 0$ such that
$\mathbf{Q}'_1, \mathbf{Q}'_2$ tend to the point $\mathbf{Q}'$ that lies on
the boundary between the two orthants, it follows that the distances
$d(\mathbf{Q}_1, \mathbf{Q}')$ and $d(\mathbf{Q}_2, \mathbf{Q}')$ are
preserved under $T$.

Similarly, one can choose two further pairs of points,
$\mathbf{Q}''_1, \mathbf{Q}''_2$ and $\mathbf{Q}'''_1, \mathbf{Q}'''_2$,
that lie in the first and second orthants respectively, and conclude that,
if they tend to the points $\mathbf{Q}'', \mathbf{Q}'''$, respectively,
which both lie on the boundary between the two orthants, the distances
$d(\mathbf{Q}_i, \mathbf{Q}'')$ and $d(\mathbf{Q}_i, \mathbf{Q}''')$, for
$i = 1, 2$, are also preserved under $T$. Let us now choose
$\mathbf{Q}', \mathbf{Q}'', \mathbf{Q}'''$ to be distinct points. Since the
distances of $\mathbf{Q}_1$ and $\mathbf{Q}_2$ from
$\mathbf{Q}', \mathbf{Q}'', \mathbf{Q}'''$ are all invariant under $T$, it
follows that the distance $d(\mathbf{Q}_1, \mathbf{Q}_2)$ is invariant.

The above argument can be readily generalized to the case of two points in
adjacent orthants for general $N$, and, further, to the case where two
points are in non-adjacent orthants.

Second, since $T$ preserves the distance between any two points on the
hypersphere, it is an orthogonal transformation of $S^{2N-1}$. But we have
already noted that any orthogonal transformation of $S^{2N-1}$ is an
acceptable map $T$. Hence, the set of all $T$ is equal to the set of
orthogonal transformations of $S^{2N-1}$.
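The two properties of orthogonal maps used in Step 1, a unit Jacobian modulus and preservation of distances between points on the hypersphere, can be illustrated numerically. The sketch below is only an illustration under assumptions made here: the random orthogonal matrix is generated by a QR decomposition, and the dimension, seeds and helper names are arbitrary choices.

```python
import numpy as np

def random_orthogonal(dim, seed=0):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q * np.sign(np.diag(r))                   # column sign fix; q stays orthogonal

def random_unit_vectors(num, dim, seed=1):
    """Rows are random points on the unit hypersphere in R^dim."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((num, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def pairwise(X):
    """Matrix of Euclidean distances between the rows of X."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

N = 3
T = random_orthogonal(2 * N)                         # acts on the sphere S^(2N-1)
Q = random_unit_vectors(5, 2 * N)
Q_prime = Q @ T.T                                    # each row is T applied to a state

if __name__ == "__main__":
    print(abs(np.linalg.det(T)))                     # modulus of the Jacobian of a linear map
    print(np.max(np.abs(pairwise(Q) - pairwise(Q_prime))))
```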



2. Step 2: Imposition of Postulate 3.2

Postulate 3.2 requires that the outcome probabilities
$P'_1, P'_2, \dots, P'_N$ of measurement A performed on a system in state
$\mathbf{Q}' = T(\mathbf{Q})$ are unaffected if, in the state $\mathbf{Q}$
written down with respect to measurement A, an arbitrary real constant,
$\chi_0$, is added to each of the $\chi_i$.

Since $T$ is an orthogonal transformation, it can be represented by the
$2N$-dimensional orthogonal matrix, $M$. Under its action, the vector
$\mathbf{Q}$ transforms as

\[
\mathbf{Q}' = M\mathbf{Q}. \tag{48}
\]

Multiplying this out, the form of $P'_k$ in terms of the $P_i$ and $\chi_i$
is

\[
\begin{aligned}
P'_k = {} & \sum_i P_i \big[ (M_{2k-1,2i-1} \cos\chi_i + M_{2k-1,2i} \sin\chi_i)^2
 + (M_{2k,2i-1} \cos\chi_i + M_{2k,2i} \sin\chi_i)^2 \big] \\
& + 2 \sum_{i<j} \sqrt{P_i P_j}\, \big[ A_{kij} \cos\chi_i \cos\chi_j + B_{kij} \cos\chi_i \sin\chi_j
 + C_{kij} \sin\chi_i \cos\chi_j + D_{kij} \sin\chi_i \sin\chi_j \big],
\end{aligned} \tag{49}
\]

where

\[
\begin{aligned}
A_{kij} &= M_{2k-1,2i-1} M_{2k-1,2j-1} + M_{2k,2i-1} M_{2k,2j-1}, \\
B_{kij} &= M_{2k-1,2i-1} M_{2k-1,2j} + M_{2k,2i-1} M_{2k,2j}, \\
C_{kij} &= M_{2k-1,2i} M_{2k-1,2j-1} + M_{2k,2i} M_{2k,2j-1}, \\
D_{kij} &= M_{2k-1,2i} M_{2k-1,2j} + M_{2k,2i} M_{2k,2j}.
\end{aligned} \tag{50}
\]

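In order to implement Postulate 3.2, it is helpful to rewrite the above
expression for $P'_k$ so that the $\chi_i$ appear in the form
$(\chi_i \pm \chi_j)$ since the value of terms of the form
$(\chi_i - \chi_j)$ remains unchanged under the addition of $\chi_0$ to each
of the $\chi_i$. One finds that

\[
\begin{aligned}
P'_k = {} & \sum_i \tfrac{1}{2} P_i (\alpha_{ki} + \beta_{ki})
 + \sum_{i<j} \sqrt{P_i P_j}\, \big[ (A_{kij} + D_{kij}) \cos(\chi_i - \chi_j) - (B_{kij} - C_{kij}) \sin(\chi_i - \chi_j) \big] \\
& + \sum_i \big[ \tfrac{1}{2} (\alpha_{ki} - \beta_{ki}) P_i \cos(\chi_i - \chi_{i \oplus 1})
 + \gamma_{ki} P_i \sin(\chi_i - \chi_{i \oplus 1}) \big] \cos(\chi_i + \chi_{i \oplus 1}) \\
& + \sum_i \big[ \gamma_{ki} P_i \cos(\chi_i - \chi_{i \oplus 1})
 - \tfrac{1}{2} (\alpha_{ki} - \beta_{ki}) P_i \sin(\chi_i - \chi_{i \oplus 1}) \big] \sin(\chi_i + \chi_{i \oplus 1}) \\
& + \sum_{i<j} \sqrt{P_i P_j}\, \big[ (A_{kij} - D_{kij}) \cos(\chi_i + \chi_j) + (B_{kij} + C_{kij}) \sin(\chi_i + \chi_j) \big],
\end{aligned} \tag{51}
\]

where

\[
\begin{aligned}
\alpha_{ki} &= M_{2k-1,2i-1}^2 + M_{2k,2i-1}^2, \\
\beta_{ki} &= M_{2k-1,2i}^2 + M_{2k,2i}^2, \\
\gamma_{ki} &= M_{2k-1,2i-1} M_{2k-1,2i} + M_{2k,2i-1} M_{2k,2i},
\end{aligned} \tag{52}
\]

and $\oplus$ denotes addition modulo $N$.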

Postulate 3.2 must hold for any $P_i$ and $\chi_i$. Therefore, in
particular, it must be true for the special case where all but one, say
$P_i$, of the $P_i$ are zero and all of the $\chi_i$ have the same value. In
this case, Eq. (51) simplifies to

\[
P'_k = \tfrac{1}{2} (\alpha_{ki} + \beta_{ki})
 + \tfrac{1}{2} (\alpha_{ki} - \beta_{ki}) \cos(\chi_i + \chi_{i \oplus 1})
 + \gamma_{ki} \sin(\chi_i + \chi_{i \oplus 1}). \tag{53}
\]

We require that $P'_k$ remains unchanged as a result of the addition of any
constant $\chi_0 \in \mathbb{R}$ to the $\chi_i$. However, a linear
combination of the functions $\cos(\chi_i + \chi_{i \oplus 1})$ and
$\sin(\chi_i + \chi_{i \oplus 1})$ in which at least one of the coefficients
is non-zero is zero only on a discrete set of points. Therefore, the
coefficients of the functions $\cos(\chi_i + \chi_{i \oplus 1})$ and
$\sin(\chi_i + \chi_{i \oplus 1})$ must vanish, so that the conditions

\[
\alpha_{ki} = \beta_{ki} \quad \text{and} \quad \gamma_{ki} = 0 \quad \text{for all } i, k \tag{54}
\]

must hold.

Consider now a second special case where two of the $P_i$, say $P_i$ and
$P_j$ ($i \neq j$), are set equal to $1/2$, and the remainder are set to
zero. Then, taking into account the above conditions, Eq. (51) reduces to

\[
\begin{aligned}
P'_k = {} & \tfrac{1}{2} (\alpha_{ki} + \alpha_{kj})
 + \tfrac{1}{2} \big[ (A_{kij} + D_{kij}) \cos(\chi_i - \chi_j) - (B_{kij} - C_{kij}) \sin(\chi_i - \chi_j) \big] \\
& + \tfrac{1}{2} \big[ (A_{kij} - D_{kij}) \cos(\chi_i + \chi_j) + (B_{kij} + C_{kij}) \sin(\chi_i + \chi_j) \big].
\end{aligned} \tag{55}
\]

Once again, in order that $P'_k$ remains unchanged as a result of the
addition of $\chi_0 \in \mathbb{R}$ to the $\chi_i$, the coefficients of the
functions $\cos(\chi_i + \chi_j)$ and $\sin(\chi_i + \chi_j)$ must vanish,
so that a second set of conditions,

\[
A_{kij} = D_{kij} \quad \text{and} \quad B_{kij} = -C_{kij} \quad \text{for all } i, j \text{ and } k, \text{ with } i \neq j, \tag{56}
\]

must hold.
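The restriction imposed by Postulate 3.2 can be seen numerically: a generic orthogonal matrix changes the output probabilities $P'_k$ when a common constant $\chi_0$ is added to the $\chi_i$, whereas a matrix satisfying $\alpha_{ki} = \beta_{ki}$ and $\gamma_{ki} = 0$ need not. The sketch below assumes the state parameterization $Q_{2i-1} = \sqrt{P_i}\cos\chi_i$, $Q_{2i} = \sqrt{P_i}\sin\chi_i$; the test matrices (a random orthogonal matrix and a block-diagonal matrix of 2-by-2 rotations), the numerical values and all function names are illustrative assumptions.

```python
import numpy as np

def state_vector(P, chi):
    """Q with Q[2i-1] = sqrt(P_i) cos(chi_i), Q[2i] = sqrt(P_i) sin(chi_i) (1-indexed)."""
    Q = np.empty(2 * len(P))
    Q[0::2] = np.sqrt(P) * np.cos(chi)
    Q[1::2] = np.sqrt(P) * np.sin(chi)
    return Q

def outcome_probs(Q):
    """P'_k = Q_{2k-1}^2 + Q_{2k}^2."""
    return Q[0::2] ** 2 + Q[1::2] ** 2

def alpha_beta_gamma(M):
    """alpha_ki, beta_ki, gamma_ki as defined from the entries of M."""
    top, bottom = M[0::2, :], M[1::2, :]             # rows 2k-1 and 2k (1-indexed)
    alpha = top[:, 0::2] ** 2 + bottom[:, 0::2] ** 2
    beta = top[:, 1::2] ** 2 + bottom[:, 1::2] ** 2
    gamma = top[:, 0::2] * top[:, 1::2] + bottom[:, 0::2] * bottom[:, 1::2]
    return alpha, beta, gamma

def random_orthogonal(dim, seed=0):
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def block_rotation(angles):
    """Block-diagonal matrix of 2x2 rotations (a special case of scale-rotation blocks)."""
    M = np.zeros((2 * len(angles), 2 * len(angles)))
    for i, phi in enumerate(angles):
        c, s = np.cos(phi), np.sin(phi)
        M[2 * i:2 * i + 2, 2 * i:2 * i + 2] = [[c, -s], [s, c]]
    return M

P = np.array([0.2, 0.3, 0.5])
chi = np.array([0.1, 0.7, 1.9])
chi0 = 0.4

def shift_changes_probs(M):
    """True if adding chi0 to every chi_i changes the output probabilities."""
    before = outcome_probs(M @ state_vector(P, chi))
    after = outcome_probs(M @ state_vector(P, chi + chi0))
    return not np.allclose(before, after)

if __name__ == "__main__":
    print(shift_changes_probs(random_orthogonal(6)))
    print(shift_changes_probs(block_rotation([0.3, 1.1, 2.0])))
```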

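The most general matrix, $M$, which satisfies the first set of conditions,
expressed in Eqs. (54), can be written in the form of an $N$-by-$N$ array of
two-by-two sub-matrices,

\[
M = \begin{pmatrix}
T^{(11)} & T^{(12)} & \cdots & T^{(1N)} \\
T^{(21)} & T^{(22)} & \cdots & T^{(2N)} \\
\vdots & \vdots & \ddots & \vdots \\
T^{(N1)} & T^{(N2)} & \cdots & T^{(NN)}
\end{pmatrix}, \tag{57}
\]

where

\[
T^{(ij)} = \sqrt{\alpha_{ij}} \begin{pmatrix}
\cos\varphi_{ij} & -\sigma_{ij} \sin\varphi_{ij} \\
\sin\varphi_{ij} & \sigma_{ij} \cos\varphi_{ij}
\end{pmatrix}
\]

is a two-by-two matrix composed of an enlargement matrix (scale factor
$\sqrt{\alpha_{ij}}$) and a rotation matrix if $\sigma_{ij} = 1$ or a
reflection-rotation matrix (that is, a matrix representing a reflection
followed by rotation) if $\sigma_{ij} = -1$, with rotation angle
$\varphi_{ij}$ in either case.

In terms of the $\alpha_{ij}$ and the $\sigma_{ij}$, Eqs. (50) then become

\[
\begin{aligned}
A_{kij} &= \sqrt{\alpha_{ki} \alpha_{kj}}\, (\cos\varphi_{ki} \cos\varphi_{kj} + \sin\varphi_{ki} \sin\varphi_{kj}), \\
B_{kij} &= \sigma_{kj} \sqrt{\alpha_{ki} \alpha_{kj}}\, (-\cos\varphi_{ki} \sin\varphi_{kj} + \sin\varphi_{ki} \cos\varphi_{kj}), \\
C_{kij} &= \sigma_{ki} \sqrt{\alpha_{ki} \alpha_{kj}}\, (-\sin\varphi_{ki} \cos\varphi_{kj} + \cos\varphi_{ki} \sin\varphi_{kj}), \\
D_{kij} &= \sigma_{ki} \sigma_{kj} \sqrt{\alpha_{ki} \alpha_{kj}}\, (\sin\varphi_{ki} \sin\varphi_{kj} + \cos\varphi_{ki} \cos\varphi_{kj}).
\end{aligned} \tag{58}
\]

In order to satisfy the second set of conditions, expressed in Eqs. (56),
one finds that, for all $i$, $j$ and $k$, either
$\sigma_{ki} = \sigma_{kj}$ or $\alpha_{ki} \alpha_{kj} = 0$ must hold.
Hence, when written in the form in Eq. (57), the non-zero $T$ sub-matrices
in a given row of $M$ are either all scale-rotation or all
scale-reflection-rotation matrices.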

Since $M$ represents the mapping, $\mathcal{M}$, and, by Postulate 3.1,
$\mathcal{M}^{-1}$ exists, the matrix $M^{-1}$ represents the mapping
$\mathcal{M}^{-1}$. Hence, the matrix $M^{-1} = M^{T}$ must also satisfy
Postulate 3.2. Now, from Eq. (57), the matrix $M^{T}$ takes the form

\[
M^{T} = \begin{pmatrix}
(T^{(11)})^{T} & (T^{(21)})^{T} & \cdots \\
(T^{(12)})^{T} & (T^{(22)})^{T} & \cdots \\
\vdots & \vdots & \ddots
\end{pmatrix}. \tag{59}
\]

In order to satisfy Postulate 3.2, the non-zero sub-matrices of $M^{T}$ in a
given row are either all scale-rotation or all scale-reflection-rotation
matrices. But this implies that, in $M$, the non-zero $T$ sub-matrices in a
given column are either all scale-rotation or all scale-reflection-rotation
matrices. Hence, the non-zero $T$ sub-matrices that compose the matrix $M$
are either all scale-rotation or all scale-reflection-rotation matrices.

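Recasting M as a complex transformation. At this point, it is convenient to
recast the effect of $M$ on the state in a complex form. Let the complex
form of the state, $\mathbf{Q}$, be defined as

\[
\mathbf{v} = \begin{pmatrix}
Q_1 + iQ_2 \\
Q_3 + iQ_4 \\
\vdots \\
Q_{2N-1} + iQ_{2N}
\end{pmatrix}, \tag{60}
\]

and let us suppose that the $\mathbf{v}$ are vectors in a complex vector
space with inner product, $(\mathbf{u}, \mathbf{v}) = \sum_i u_i^* v_i$, and
norm $|\mathbf{v}| = \sqrt{(\mathbf{v}, \mathbf{v})}$. Consider the action
of the $N$-dimensional complex matrix, $V$, on $\mathbf{v}$,

\[
\mathbf{v}' = V\mathbf{v}, \tag{61}
\]

where $\mathbf{v}'$ is defined analogously to $\mathbf{v}$. By multiplying
out the real and imaginary parts of this expression, it can be seen that the
effect of $V$ on $\mathbf{v}$ is equivalent to the action of the real
$2N$-dimensional matrix, $M_V$, on $\mathbf{Q}$,

\[
\mathbf{Q}' = M_V \mathbf{Q}, \tag{62}
\]

with

\[
M_V = \begin{pmatrix}
V^R_{11} & -V^I_{11} & \cdots & V^R_{1N} & -V^I_{1N} \\
V^I_{11} & V^R_{11} & \cdots & V^I_{1N} & V^R_{1N} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
V^R_{N1} & -V^I_{N1} & \cdots & V^R_{NN} & -V^I_{NN} \\
V^I_{N1} & V^R_{N1} & \cdots & V^I_{NN} & V^R_{NN}
\end{pmatrix}, \tag{63}
\]

where $V^R_{ij}$ and $V^I_{ij}$ are, respectively, the real and imaginary
parts of $V_{ij}$. If $V_{ij}$ is chosen to be
$\sqrt{\alpha_{ij}} \exp(i\varphi_{ij})$, then $M_V$ becomes identical to
$M$ in the case where the non-zero $T$ sub-matrices of $M$ consist of
scale-rotations.

The orthogonality of $M_V$ implies that $V$ is unitary. To see this,
consider

\[
(V^\dagger V)_{ij} = \sum_k \sqrt{\alpha_{ki} \alpha_{kj}}\; e^{-i(\varphi_{ki} - \varphi_{kj})}. \tag{64}
\]

Denote by $\mathbf{M}_q$ the $2N$-dimensional real vector formed from the
$q$th column of $M_V$, and let the relations in Eqs. (50) and (52) be
defined for $M_V$. Then, from Eqs. (52),
$(V^\dagger V)_{ii} = \sum_k \alpha_{ki}$ is $|\mathbf{M}_{2i-1}|^2$, which
is unity since $M_V$ is an orthogonal matrix. To evaluate
$(V^\dagger V)_{ij}$ for $i \neq j$, it is helpful to rewrite $A_{kij}$ and
$B_{kij}$ in terms of $V_{ij}$,

\[
A_{kij} = V^R_{ki} V^R_{kj} + V^I_{ki} V^I_{kj}, \tag{65}
\]
\[
B_{kij} = V^I_{ki} V^R_{kj} - V^R_{ki} V^I_{kj}, \tag{66}
\]

so that

\[
V^*_{ki} V_{kj} = A_{kij} - iB_{kij}, \tag{67}
\]

and

\[
\sum_{k=1}^{N} V^*_{ki} V_{kj} = \sum_{k=1}^{N} A_{kij} - i \sum_{k=1}^{N} B_{kij}
 = \mathbf{M}_{2i-1} \cdot \mathbf{M}_{2j-1} - i\, \mathbf{M}_{2i-1} \cdot \mathbf{M}_{2j}, \tag{68}
\]

which, due to the orthogonality of $M_V$, is zero whenever $i \neq j$.
Therefore, $(V^\dagger V)_{ij} = \delta_{ij}$, so that $V$ is unitary.

Similarly, if one considers the effect of the complex transformation $VK$,
where $K$ is the complex conjugation operation, acting on $\mathbf{v}$,

\[
\mathbf{v}' = VK\mathbf{v}, \tag{69}
\]

one finds that this is equivalent to the action of the matrix $M$ on
$\mathbf{Q}$ in the case that the non-zero $T$ sub-matrices that comprise
$M$ are scale-reflection-rotation matrices. Since $V$ is unitary, the
transformation $VK$ is antiunitary.

Thus far, we have shown only that the complex transformations $V$ and $VK$
satisfy Postulate 3.2 in the special cases of $\mathbf{Q}$ examined above.
To show that these transformations satisfy Postulate 3.2 for any state, note
that the addition of $\chi_0$ to each of the $\chi_i$ in the complex form of
the state, $\mathbf{v}$, generates the vector $e^{i\chi_0} \mathbf{v}$, that
is

\[
\mathbf{v} \to e^{i\chi_0} \mathbf{v}. \tag{70}
\]

As a result, the vector $\mathbf{v}'$ in Eq. (61) transforms as

\[
\mathbf{v}' \to e^{i\chi_0} \mathbf{v}', \tag{71}
\]

and the vector $\mathbf{v}'$ in Eq. (69) transforms as

\[
\mathbf{v}' \to e^{-i\chi_0} \mathbf{v}'. \tag{72}
\]

Since the $P'_i$ are independent of the overall phase of $\mathbf{v}'$, it
follows that, in both Eqs. (71) and (72), the $P'_i$ remain unchanged by the
addition of $\chi_0$ to the $\chi_i$. Therefore, the transformations $V$ and
$VK$ both satisfy Postulate 3.2.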
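The correspondence between a complex matrix $V$ and its real $2N$-dimensional form, and the phase invariance of the output probabilities under $V$ and $VK$, can be checked with a short sketch. The helper names, the random unitary (generated here by a QR decomposition), and the numerical values are illustrative assumptions; the real form follows the block pattern $\begin{pmatrix} V^R & -V^I \\ V^I & V^R \end{pmatrix}$ described above.

```python
import numpy as np

def random_unitary(N, seed=0):
    """Random unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    q, _ = np.linalg.qr(z)
    return q

def to_real(v):
    """Real 2N-vector Q with v_i = Q_{2i-1} + i Q_{2i} (1-indexed)."""
    Q = np.empty(2 * len(v))
    Q[0::2], Q[1::2] = v.real, v.imag
    return Q

def real_form(V):
    """Real 2N x 2N matrix whose 2x2 blocks are [[V^R, -V^I], [V^I, V^R]]."""
    N = V.shape[0]
    M = np.empty((2 * N, 2 * N))
    M[0::2, 0::2] = V.real
    M[0::2, 1::2] = -V.imag
    M[1::2, 0::2] = V.imag
    M[1::2, 1::2] = V.real
    return M

def probs(v):
    """Outcome probabilities |v_i|^2."""
    return np.abs(v) ** 2

N = 3
V = random_unitary(N)
P = np.array([0.2, 0.3, 0.5])
chi = np.array([0.1, 0.7, 1.9])
v = np.sqrt(P) * np.exp(1j * chi)                    # complex form of the state
shifted = np.exp(1j * 0.4) * v                       # chi_i -> chi_i + chi_0, chi_0 = 0.4

if __name__ == "__main__":
    print(np.allclose(to_real(V @ v), real_form(V) @ to_real(v)))
    print(probs(V @ v), probs(V @ shifted))                     # unitary V
    print(probs(V @ np.conj(v)), probs(V @ np.conj(shifted)))   # antiunitary VK
```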
3. Step 3: General Unitary and Antiunitary Transformations

We have shown so far that the imposition of Postulate 3.2 restricts $M$ to a
subset of the set of orthogonal transformations, and that each
transformation in this subset can be recast as either a unitary or an
antiunitary transformation. But we have not ruled out the possibility that
there are unitary or antiunitary transformations which are not equivalent to
orthogonal transformations satisfying Postulate 3.2. In this section, it
shall be shown that, in fact, any $N$-dimensional unitary or antiunitary
transformation satisfies Postulates 3.1, 3.2, and 4.

Consider the arbitrary unitary transformation U . The 
transformation 



But, since U is unitary. 



v' = Uv 

is equivalent to the transformation 
Q' = MQ, 



(73) 



(74) 



where 



M = 



,,R 



N 



N 



(tU 



NN 
I 

NN 



(75) 



with cr = 1. Similarly, using the arbitrary antiunitary 
transformation UK, one finds the corresponding matrix 
to be M with cr = — 1. 

First we show that M is an orthogonal matrix. In 
the following, denotes the real 2iV-dimensional vector 
formed from the qili column of M . 

M is an orthogonal matrix since: 

(a) the columns of M are normalized: 

|M2.„iP = from Eq. dill) 

N 

since U is unitary 



fc=i 
1 



(b) the columns of M are orthogonal: 

(i) Columns (2i — 1) and 2i, for i = 1,2, . . . , N, are 
orthogonal since, from Eq. (|75p . 



M2.-1 • M2. = 0. 



(77) 



(ii) By inspection of Eq. ([75]) . one sees that, for i ^ 



M2^-l ■ M2j„l = M2. • M2j 
M2,_i •M2j = •M2j_i. 



(78) 



N 



^ \Jl,Ukj = M2._l • M2j_l - iM2^-l ■ M 



2j 



fc=l 



(79) 



= 0, J. 
Therefore, for i ^ j, 

M2i-1 • M2j_l = M2. • M2j - 
M2^-l ■ M2j = -M2. • M2,-l = 



(80) 



Since M is an orthogonal matrix, it satisfies Postulates 3.1
and 4. The invariance of the $P'_i$ required by Postulate 3.2
follows from the observation that, under the addition of $\chi_0$
to the $\chi_i$ in v,

\[ v \to e^{i\chi_0} v. \tag{81} \]

As a result, the vector v' in Eq. (73) transforms as

\[ v' \to e^{i\sigma\chi_0} v', \tag{82} \]

with $\sigma = \pm 1$ depending upon whether a unitary or
antiunitary transformation is chosen. In either case, since
the $P'_i$ are independent of the overall phase of v', it
follows that the $P'_i$ remain invariant.

Hence, any unitary or antiunitary transformation sat- 
isfies Postulates 3.1, 3.2, and 4. 
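
The three properties established in this step can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes the convention Q = (Re v_1, Im v_1, ..., Re v_N, Im v_N) and assumes that each entry U_ij contributes the 2x2 block [[U^R_ij, -sigma U^I_ij], [U^I_ij, sigma U^R_ij]] to M. The function names, the sample unitary matrix, and the sample state are hypothetical choices made for illustration only.

```python
import cmath
import math

def real_form(v):
    """Return Q = (Re v_1, Im v_1, ..., Re v_N, Im v_N) for a complex vector v."""
    q = []
    for z in v:
        q.extend([z.real, z.imag])
    return q

def build_M(U, sigma):
    """Real 2N x 2N matrix assumed to represent U (sigma=+1) or UK (sigma=-1)."""
    n = len(U)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            re, im = U[i][j].real, U[i][j].imag
            M[2 * i][2 * j] = re
            M[2 * i][2 * j + 1] = -sigma * im
            M[2 * i + 1][2 * j] = im
            M[2 * i + 1][2 * j + 1] = sigma * re
    return M

def apply_map(U, v, sigma):
    """Apply v -> U v (sigma=+1) or v -> U v* (sigma=-1)."""
    w = v if sigma == 1 else [z.conjugate() for z in v]
    return [sum(U[i][j] * w[j] for j in range(len(w))) for i in range(len(U))]

def mat_vec(M, q):
    """Multiply a real matrix by a real vector."""
    return [sum(m * x for m, x in zip(row, q)) for row in M]

def is_orthogonal(M, tol=1e-12):
    """Check that the columns of M are orthonormal."""
    n = len(M)
    for a in range(n):
        for b in range(n):
            dot = sum(M[k][a] * M[k][b] for k in range(n))
            if abs(dot - (1.0 if a == b else 0.0)) > tol:
                return False
    return True

def close(xs, ys, tol=1e-12):
    """Element-wise comparison of two real sequences."""
    return all(abs(x - y) < tol for x, y in zip(xs, ys))

s = 1 / math.sqrt(2)
U = [[complex(s, 0), complex(0, s)], [complex(0, s), complex(s, 0)]]  # sample unitary
v = [0.6 * cmath.exp(0.3j), 0.8 * cmath.exp(-1.1j)]                    # sample state
chi0 = 0.7                                                            # overall phase shift

results = {}
for sigma in (1, -1):
    M = build_M(U, sigma)
    # (1) Q' = M Q reproduces the complex map.
    same_map = close(mat_vec(M, real_form(v)), real_form(apply_map(U, v, sigma)))
    # (2) M is orthogonal; (3) output probabilities ignore the overall phase.
    shifted = [cmath.exp(1j * chi0) * z for z in v]
    p_before = [abs(z) ** 2 for z in apply_map(U, v, sigma)]
    p_after = [abs(z) ** 2 for z in apply_map(U, shifted, sigma)]
    results[sigma] = (same_map, is_orthogonal(M), close(p_before, p_after))

print(results)
```

For both values of sigma, all three checks are expected to pass for any unitary U; a single sample matrix is of course only a consistency check and not a proof.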



4. Step 4: Physical Transformations

By Postulate 3.3, a physical transformation (such as a
reflection-rotation of a frame of reference) that depends
continuously upon a real-valued parameter n-tuple $\pi$ is
represented by a map $\mathcal{M}_\pi$ which depends continuously
upon $\pi$. From Eq. (75), the matrix $M_\pi$, which represents
$\mathcal{M}_\pi$, contains the discrete parameter $\sigma$. Given two
M-matrices, M and M', with different values of $\sigma$, it
follows from Eq. (75) that it is only possible to continuously
transform M into M' provided that M can pass
through the null matrix. However, M cannot be null
since this would require that the $U_{ij}$ simultaneously vanish,
which is impossible since U is unitary. Therefore, it is
not possible to continuously transform between two
M-matrices with different values of $\sigma$. Hence, the matrix $M_\pi$
has $\sigma = 1$ or $\sigma = -1$ for all $\pi$, which implies that the
physical transformation under discussion is represented
either by unitary ($\sigma = 1$) or antiunitary ($\sigma = -1$)
transformations.

Furthermore, by Postulate 3.3, a continuous physical
transformation that depends continuously upon a real-valued
parameter n-tuple $\pi$ is represented by a map $\mathcal{M}_\pi$
which reduces to the identity map for some value of $\pi$.
From Eq. (75), we see that, for $\sigma = 1$, the matrix M
consists of scale-rotation sub-matrices which, with a suitable
choice of the $\alpha_{ij}$ and the $\psi_{ij}$, reduces to the identity.
However, with $\sigma = -1$, it can be seen that a reduction
to the identity is not possible. Therefore, a continuous
physical transformation can only be represented by unitary
transformations ($\sigma = 1$).

Finally, a discrete physical transformation (such as
temporal inversion) is represented by a matrix M in
which either $\sigma = 1$ or $\sigma = -1$, and is therefore
represented by either a unitary or an antiunitary transformation.



C. Representation of Measurements

In the previous section, it has been shown that the
state of a system at time t that has been prepared by a
measurement in $\mathcal{A}$ can, from the point of view of a
measurement $A \in \mathcal{A}$, be represented as the complex vector

\[
v = \left( \sqrt{P_1}\, e^{i\chi_1}, \sqrt{P_2}\, e^{i\chi_2}, \dots,
\sqrt{P_N}\, e^{i\chi_N} \right)^{T}, \tag{83}
\]

where the $P_i$ are the outcome probabilities of measurement A
if performed at time t. Furthermore, it has been
shown that any interaction following the preparation can
be represented by a unitary transformation of v.

Consider an experiment where a system undergoes
some measurement $A \in \mathcal{A}$, yields a particular outcome,
and subsequently undergoes some other measurement
$A' \in \mathcal{A}$ that may or may not be the same as A.
The purpose of this section is to develop the formalism
necessary to predict the outcome probabilities in such an
experiment.

1. Prepared States

Suppose that, in the above-mentioned experiment, a
system undergoes measurement A and yields outcome j.
What is the state of the prepared system?

By Postulate 1.1, measurement A has N possible
outcomes and, by the assumption of repetition consistency
(Sec. III A 1), after A has been performed and outcome j
obtained, immediate repetition yields the same
outcome with certainty. Therefore, for every outcome j
there exists a corresponding state, $v_j$, such that the
measurement A upon the system in state $v_j$ yields outcome j
with certainty. From Eq. (60), since $P_j = 1$ and all the
other $P_i$ are zero, we have that

\[ v_j = (0, \dots, e^{i\chi_j}, \dots, 0)^{T}, \tag{84} \]

where $\chi_j$ is undetermined.

2. Measurements

By Postulate 1.2, measurement A' can be represented
by an arrangement consisting of a measurement A followed
immediately before and after by suitable interactions.
These interactions bring about continuous transformations
of the system. From the results of the previous section,
these interactions must, therefore, be represented by
unitary transformations, which we shall denote U and V,
respectively (see Fig. 2). In the following,
we shall establish the form of these matrices, and then
obtain an expression for the outcome probabilities for
measurement A' performed on a system in state v.

FIG. 2: A representation of a measurement of A'. A unitary
transformation, U, transforms the input state, v, into Uv.
Measurement A is performed on this state, and the output
state of the measurement is transformed by the unitary
transformation V.

First, from Postulate 1.1 and the assumption of repetition
consistency, there exist N states $v'_1, v'_2, \dots, v'_N$ such
that measurement A' performed on a system in state $v'_i$
yields outcome i with certainty. Hence, the arrangement
in Fig. 2 must be such that A yields outcome i with certainty
when the input state to the arrangement is $v'_i$. For
this to be the case, U must transform $v'_i$ to a state of the
form $v_i e^{i\phi_i}$, where $\phi_i$ is arbitrary. That is, the matrix U
must satisfy the relations

\[ U v'_i = v_i e^{i\phi_i}, \qquad i = 1, 2, \dots, N. \tag{85} \]

Second, if outcome i is obtained from the arrangement,
the output state of the arrangement must be of
the form $v'_i e^{i\phi'_i}$, where $\phi'_i$ is arbitrary. But, immediately
after measurement A, the system is in state $v_i$ up to
an overall phase. Hence, the matrix V must satisfy the
relations

\[ V v_i = v'_i e^{i\phi'_i}, \qquad i = 1, 2, \dots, N. \tag{86} \]

From Eq. (84), the $v_i$ form an orthonormal basis
for $\mathbb{C}^N$, and, from Eq. (85), $v'_i = U^\dagger v_i e^{i\phi_i}$, which, since U
is unitary, implies that the $v'_i$ also form an orthonormal
basis. Therefore, any state, v, can be expanded
as $\sum_i c'_i v'_i$, with $c'_i \in \mathbb{C}$, and the matrices U and V are
determined by the relations in Eqs. (85) and (86) up to
the $\phi_i$ and the $\phi'_i$.

It is now possible to determine the outcome probabilities
if a system in state v undergoes measurement A'.
Using Eq. (85) and the expansion $v = \sum_i c'_i v'_i$, the first
interaction of the arrangement transforms v into

\[ U v = \sum_i c'_i e^{i\phi_i} v_i. \tag{87} \]

The probability that measurement A in the arrangement
yields outcome i is therefore $|c'_i|^2$. Hence, measurement A'
performed on the state v yields outcome i
with probability $|c'_i|^2$.
In summary, every measurement, $A' \in \mathcal{A}$, has an associated
orthonormal basis, $\{v'_1, v'_2, \dots, v'_N\}$. Such a measurement
can be implemented by a measurement A followed
immediately before and after by interactions represented
by U and V defined in Eqs. (85) and (86) in
terms of these basis vectors. If measurement A' is performed
upon a system in state v, the probability, $P'_i$, of
obtaining outcome i is $|c'_i|^2$, where $c'_i$ is determined by
the relation $v = \sum_i c'_i v'_i$.
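
As a concrete illustration of the rule just summarized, the following sketch computes the outcome probabilities $P'_i = |c'_i|^2$ from the expansion coefficients $c'_i = v'^{\dagger}_i v$. The two-dimensional basis and the input state are hypothetical examples chosen for illustration; only the rule itself comes from the text.

```python
import math

def inner(a, b):
    """Hermitian inner product a† b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def outcome_probabilities(basis, v):
    """Return P'_i = |c'_i|^2, where c'_i = v'_i† v for an orthonormal basis."""
    return [abs(inner(b, v)) ** 2 for b in basis]

s = 1 / math.sqrt(2)
# Hypothetical orthonormal basis associated with a measurement A'.
basis_A_prime = [[complex(s), complex(s)], [complex(s), complex(-s)]]
# State prepared by obtaining outcome 1 of measurement A (overall phase set to 0).
v = [complex(1.0), complex(0.0)]

probs = outcome_probabilities(basis_A_prime, v)
print(probs)
```

In this example the two outcomes of A' are equally likely, and the probabilities sum to one because the basis is orthonormal and v is normalized.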

3. Expected Values 

If the ith outcome of measurement A' has an associated
real value $a'_i$, the expected value obtained in an
experiment in which a system in state v undergoes
measurement A' is defined as

\[ \langle A' \rangle = \sum_i a'_i P'_i. \tag{88} \]

Since $P'_i = |c'_i|^2$ and $c'_i = v'^{\dagger}_i v$, this expression can be also
written as

\[
\langle A' \rangle = \sum_i v^\dagger \left( v'_i a'_i v'^{\dagger}_i \right) v
= v^\dagger A' v, \tag{89}
\]

where the matrix $A' = \sum_i a'_i v'_i v'^{\dagger}_i$ is Hermitian since
the $a'_i$ are real, and is non-degenerate since the $a'_i$ have
been assumed to be distinct (Sec. III A).

Since the $v'_i$ are eigenvectors of A', with the $a'_i$ being
the corresponding eigenvalues, the matrix A' provides a
compact mathematical way of representing all the relevant
details about measurement A'.
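
The equivalence of the two expressions for the expected value can be checked on a small example. The sketch below builds the matrix $A' = \sum_i a'_i v'_i v'^{\dagger}_i$ from a hypothetical basis and hypothetical outcome values, and compares $v^\dagger A' v$ with $\sum_i a'_i P'_i$. All numerical inputs and function names are illustrative assumptions.

```python
import math

def inner(a, b):
    """Hermitian inner product a† b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def observable(basis, values):
    """Build A' = sum_i a'_i v'_i v'_i† as a nested list."""
    n = len(basis[0])
    return [[sum(a * b[r] * b[c].conjugate() for a, b in zip(values, basis))
             for c in range(n)] for r in range(n)]

def expectation(A, v):
    """Evaluate v† A v."""
    n = len(v)
    return sum(v[r].conjugate() * A[r][c] * v[c]
               for r in range(n) for c in range(n))

s = 1 / math.sqrt(2)
basis = [[complex(s), complex(s)], [complex(s), complex(-s)]]  # hypothetical v'_i
values = [1.0, -1.0]                                           # hypothetical a'_i
v = [complex(0.6), complex(0.8)]                               # normalized state

probs = [abs(inner(b, v)) ** 2 for b in basis]
from_probs = sum(a * p for a, p in zip(values, probs))  # sum_i a'_i P'_i
A = observable(basis, values)
from_matrix = expectation(A, v)                         # v† A' v

print(from_probs, from_matrix)
```

Both routes give the same number, and the constructed matrix equals its own conjugate transpose, as expected for real outcome values.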



D. Composite Systems

It is often the case that a given physical system can be
subject to examination in distinct experimental set-ups,
where, loosely speaking, the measurements in each set-up
probe distinct properties of the system. Formally, we
can express this as follows.

Consider a system which admits abstract quantum
model, $q(N^{(1)})$, with respect to measurement set $\mathcal{A}^{(1)}$,
and which admits abstract quantum model, $q(N^{(2)})$,
with respect to measurement set $\mathcal{A}^{(2)}$, where the set-ups
defined by measurement sets $\mathcal{A}^{(1)}$ and $\mathcal{A}^{(2)}$ are disjoint
(in the sense defined in Sec. II). The system can
also be modeled as a whole. That is, we can construct
the measurement set $\mathcal{A} = \mathcal{A}^{(1)} \times \mathcal{A}^{(2)}$, and construct
abstract quantum model $q(N)$, where $N = N^{(1)} N^{(2)}$. We
shall accordingly speak of the system as a composite system
consisting of two sub-systems. More generally, if a
system admits d (d > 1) abstract quantum models with
respect to d disjoint measurement sets, we shall speak of
it as a composite system consisting of d sub-systems.

One often prepares a state of a composite system by
first preparing each of its sub-systems, and then allowing
these sub-systems to interact with one another. In order
to formally describe such a procedure, one needs a rule,
the composite system rule, which we shall now derive,
that enables the state of the system to be written down
in terms of the states of its sub-systems.

The Composite System Rule

In order to derive the composite system rule, we
shall apply Postulate 5 to the case of a composite system
with two sub-systems with abstract models $q(N^{(1)})$
and $q(N^{(2)})$, respectively, where the composite system
has the abstract model $q(N)$.

Suppose that the sub-systems are in states represented
as $(P^{(1)}_i; \chi^{(1)}_i)$ and $(P^{(2)}_j; \chi^{(2)}_j)$, respectively. Then, by
Postulate 5, the state of the composite system can be
represented as $(P_{ij}; \chi_{ij})$, where

\[ P_{ij} = P^{(1)}_i P^{(2)}_j \tag{90} \]
\[ \chi_{ij} = \chi^{(1)}_i + \chi^{(2)}_j. \tag{91} \]

If we write the states of the sub-systems in complex form,

\[
v^{(1)} = \left( \sqrt{P^{(1)}_1}\, e^{i\chi^{(1)}_1}, \dots,
\sqrt{P^{(1)}_{N^{(1)}}}\, e^{i\chi^{(1)}_{N^{(1)}}} \right)^{T}
\]

and

\[
v^{(2)} = \left( \sqrt{P^{(2)}_1}\, e^{i\chi^{(2)}_1}, \dots,
\sqrt{P^{(2)}_{N^{(2)}}}\, e^{i\chi^{(2)}_{N^{(2)}}} \right)^{T},
\]

respectively, and, similarly, write the state of the composite
system as

\[
v = \left( \sqrt{P_{11}}\, e^{i\chi_{11}}, \sqrt{P_{12}}\, e^{i\chi_{12}}, \dots,
\sqrt{P_{N^{(1)} N^{(2)}}}\, e^{i\chi_{N^{(1)} N^{(2)}}} \right)^{T},
\]

then it follows from Eqs. (90) and (91) that v can simply
be written as $v^{(1)} \otimes v^{(2)}$.

More generally, consider a composite system
with d sub-systems, numbered 1, 2, ..., d, in
states $v^{(1)}, v^{(2)}, \dots, v^{(d)}$, respectively. We can regard
sub-systems 1 and 2 as comprising a bipartite
composite system, system 1', which, according to the
above result, is in state $v^{(1)} \otimes v^{(2)}$. Next, we can regard
system 1' and sub-system 3 as comprising a bipartite
composite system, system 2', which is therefore in
state $(v^{(1)} \otimes v^{(2)}) \otimes v^{(3)}$. Continuing in this way, we
see that the composite system with d sub-systems
is in the state $v = v^{(1)} \otimes v^{(2)} \otimes \cdots \otimes v^{(d)}$.
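
The composite system rule for two sub-systems can be illustrated numerically. The sketch below forms the tensor product of two complex state vectors and compares it with the state built directly from $P_{ij} = P^{(1)}_i P^{(2)}_j$ and $\chi_{ij} = \chi^{(1)}_i + \chi^{(2)}_j$. The probabilities and phases are hypothetical sample values, and the ordering of components (index i varying slowest) is an assumption of this sketch.

```python
import cmath
import math

def complex_state(probs, phases):
    """Return the vector with components sqrt(P_i) * exp(i * chi_i)."""
    return [math.sqrt(p) * cmath.exp(1j * x) for p, x in zip(probs, phases)]

def tensor(a, b):
    """Kronecker product; component (i, j) is stored at index i * len(b) + j."""
    return [x * y for x in a for y in b]

# Hypothetical sub-system states.
P1, chi1 = [0.25, 0.75], [0.1, 0.9]
P2, chi2 = [0.5, 0.3, 0.2], [0.0, -0.4, 1.3]

v = tensor(complex_state(P1, chi1), complex_state(P2, chi2))

# Composite state built directly from the product and sum rules.
P12 = [p * q for p in P1 for q in P2]
chi12 = [x + y for x in chi1 for y in chi2]
v_direct = complex_state(P12, chi12)

max_diff = max(abs(a - b) for a, b in zip(v, v_direct))
print(len(v), max_diff)
```

The two constructions agree to rounding error, and applying `tensor` repeatedly gives the d-sub-system state described in the text.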
E. Some Generalizations

1. Representation of sub-system measurements 

Suppose that measurement $A^{(1)} \in \mathcal{A}^{(1)}$, represented
by $N^{(1)}$-dimensional Hermitian operator $A^{(1)}$, with eigenstates
$v^{(1)}_i$ and eigenvalues $a_i$, respectively, is performed
on sub-system 1 of a bipartite composite system. With
respect to the abstract quantum model $q(N)$ of the composite
system, measurement $A^{(1)}$ is not in the measurement
set $\mathcal{A}$ of the composite system since the measurement
has only $N^{(1)}$ distinct outcomes whereas a measurement
in $\mathcal{A}$ has $N = N^{(1)} N^{(2)} > N^{(1)}$ possible outcomes.
However, it is convenient to be able to describe
measurement $A^{(1)}$, which we shall describe as a sub-system
measurement, as an N-dimensional operator A, in the
framework of $q(N)$.

To determine the form of A, it is sufficient to consider
the effect of A on product states of the form $v^{(1)}_i \otimes v^{(2)}$
of the composite system, where $A^{(1)} v^{(1)}_i = a_i v^{(1)}_i$. If the
composite system is in such a state, then sub-system 1
is in state $v^{(1)}_i$. Therefore, when measurement $A^{(1)}$ is
performed, outcome $a_i$ is obtained with certainty, and
the state of sub-system 1 is unchanged (up to an irrelevant
overall phase). Therefore, the state of the composite
system remains unchanged. If we require that A has
eigenvectors $v^{(1)}_i \otimes v^{(2)}$, with respective eigenvalues $a_i$, it
follows that A can be taken to be $A^{(1)} \otimes I^{(2)}$, where $I^{(2)}$ is
the identity matrix in the model of sub-system 2, with the
only freedom being a physically irrelevant overall phase
in each of the eigenstates of $A^{(1)}$.

The above result trivially generalizes to the case of a 
measurement performed on one sub-system of a compos- 
ite system consisting of d sub-systems. 
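
The form $A^{(1)} \otimes I^{(2)}$ can be checked on a small example: every product state whose first factor is an eigenvector of $A^{(1)}$ should be an eigenvector of the composite operator with the same eigenvalue. In the sketch below the operator, the eigenvectors, and the sub-system 2 state are hypothetical choices, and the helper names are not taken from the text.

```python
import math

def kron_matrix(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def kron_vector(a, b):
    """Kronecker product of two vectors, matching kron_matrix's index order."""
    return [x * y for x in a for y in b]

def mat_vec(M, x):
    """Multiply a matrix by a vector."""
    return [sum(m * c for m, c in zip(row, x)) for row in M]

s = 1 / math.sqrt(2)
A1 = [[0.0, 1.0], [1.0, 0.0]]                 # hypothetical Hermitian operator
eigenpairs = [(1.0, [s, s]), (-1.0, [s, -s])]  # its eigenvalues and eigenvectors
I2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v2 = [0.6, 0.0, 0.8j]                         # arbitrary state of sub-system 2

A = kron_matrix(A1, I2)                       # candidate composite operator

residuals = []
for a_i, v1 in eigenpairs:
    product = kron_vector(v1, v2)
    image = mat_vec(A, product)
    # Largest deviation of A (v1 ⊗ v2) from a_i (v1 ⊗ v2).
    residuals.append(max(abs(x - a_i * y) for x, y in zip(image, product)))

print(residuals)
```

Both residuals vanish to rounding error, independently of the choice of `v2`.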



2. Degenerate measurements 

The model $q(N)$, whose explicit mathematical form
has been derived above, applies to an abstract set-up
where the measurements, chosen from the set $\mathcal{A}$, have N
possible outcomes and therefore, by the distinctness assumption
of Sec. III A, necessarily have N distinct outcome
values. From the above discussions, it follows
that each measurement $A \in \mathcal{A}$ is represented by a
non-degenerate Hermitian operator of dimension N.

Now, it is useful to be able to describe measurements
within the context of model $q(N)$ which have fewer
than N outcomes. An example of such measurements
that we have discussed above are sub-system measurements.
We shall now broaden the discussion to allow for
measurements with N' < N possible outcomes where N'
is not a divisor of N and which therefore cannot be
regarded as sub-system measurements.

Consider an abstract set-up where a preparation implemented
using a measurement from $\mathcal{A}$ is followed by
measurement A, whose observable outcome probabilities
are denoted $P_1, \dots, P_N$. Suppose that, if measurement B
(with N' < N possible outcomes) replaces measurement A,
the outcome probabilities, $P'_1, \dots, P'_{N'}$, of
measurement B can be determined from the $P_i$ by a
many-to-one map of the outcomes of A to the outcomes
of B. For example, in the case where N = 3 and N' = 2,
the map from the outcomes of A to the outcomes of B
might consist in 1 → 1', 2 → 2' and 3 → 2', in which
case $P'_1 = P_1$ and $P'_2 = P_2 + P_3$. In such a case, we shall
say that measurement B is a degenerate form of measurement A;
or, more simply, that measurement B is a
degenerate measurement.

Now, measurement B can formally be treated as if it 
has N possible outcomes, but where some of these out- 
comes have the same value. In this mode of description, 
in the above example, one can maintain a one-to-one map 
between the outcomes of A and of B (so that 1 → 1',
2 → 2' and so on), but label the outcomes of B with
their outcome values, and, when computing the outcome
probabilities of B, group together the outcomes with the
same outcome value. In the above example, one would
respectively label the three outcomes with outcome values
$b_1$, $b_2$ and $b_3$, but have $b_2 = b_3$.

Since measurement B is a degenerate form of measurement A,
it can be represented by the N-dimensional degenerate
Hermitian operator $B = \sum_i b_i v_i v_i^\dagger$, where
$A v_i = a_i v_i$. The outcome probabilities for measurement B can
then be computed in the usual way, on the understanding 
that those outcomes with the same outcome values must 
not be regarded as physically distinguishable, but must 
be grouped as just described. 
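
The grouping prescription can be made concrete with the N = 3, N' = 2 example from the text. The sketch below takes hypothetical outcome probabilities for measurement A and hypothetical outcome values for B (with $b_2 = b_3$), and sums the probabilities of outcomes that share a value. The function name and the numbers are illustrative assumptions.

```python
from collections import defaultdict

def grouped_probabilities(probs, values):
    """Sum the probabilities of outcomes that carry the same outcome value."""
    grouped = defaultdict(float)
    for p, b in zip(probs, values):
        grouped[b] += p
    return dict(grouped)

# Hypothetical outcome probabilities P_1, P_2, P_3 of measurement A.
probs_A = [0.5, 0.3, 0.2]
# Outcome values b_1, b_2, b_3 of the degenerate measurement B, with b_2 = b_3.
values_B = [1.0, 2.0, 2.0]

probs_B = grouped_probabilities(probs_A, values_B)
print(probs_B)
```

Here the first distinct value of B occurs with probability $P_1$ and the second with probability $P_2 + P_3$, which is the map 1 → 1', 2 → 2', 3 → 2' described above.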

Conversely, in an abstract set-up where $\mathcal{A}$ contains
measurements represented by all possible non-degenerate
Hermitian operators, a degenerate Hermitian operator
can be regarded as representing a measurement which is
a degenerate form of some measurement in $\mathcal{A}$.

V. DISCUSSION 
A. General discussion of the Formulation 

Above, we have formulated a set of background assumptions
(partitioning, time, and states), an abstract
experimental set-up, and a set of postulates, from which
we have shown that it is possible to derive the finite- 
dimensional abstract quantum formalism (apart from the 
explicit form of the temporal evolution operator, which 
will be derived in Paper II). 

As described earlier, the background assumptions and 
the postulates have been formulated as far as possible 
so that they possess the properties of transparency and 
traceability. The background assumptions and a num- 
ber of the postulates (Postulates 3, 3.1, 3.3) are drawn 
unchanged from the framework of classical physics, and 
most of the remaining postulates are drawn from the 
framework of classical physics but modified in light 
of experimental facts (Postulate 1.1), or are based on
a classical-quantum correspondence argument (Postu- 
lates 2.1, 2.4, 3.2, 3.4, 5). Hence, the majority of the 
background assumptions and postulates can be traced to 
facts or principles that are, or can be, well grounded or 
reasonably well grounded in experimental facts or in our 
theoretical intuition. 

Of the remaining, novel postulates (Postu- 
lates 1.2, 2.2, 2.3, 4), Postulate 1.2 is a direct gen- 
eralization of experimental facts, and Postulate 4 is 
a reasonable consistency principle. Postulates 2.2 
and 2.3 are both transparent in that they can be clearly 
understood as assertions about the physical world, and 
Postulate 2.3 is traceable to a plausible theoretical 
principle. Furthermore, since Postulates 2.2 and 2.3, in 
conjunction with the above-mentioned postulates, give 
rise to the abstract quantum formalism, there is good 
reason to believe that they are valid. Nevertheless, these 
two postulates, particularly Postulate 2.2, are less well 
grounded in our theoretical intuition than the others, 
and since they play such a key role in the emergence of 
the quantum formalism, they shall be discussed further 
below. 

We mention briefly that it is also possible to under- 
stand some of the postulates using concepts that have 
not been mentioned thus far. For example, Postulate 2.1
implies that, when a measurement is performed on a sys- 
tem, there are degrees of freedom in the state of a sys- 
tem about which no information is gained. Hence, Pos- 
tulate 2.1 can be regarded as a concrete expression of 
Bohr's principle of complementarity. Consequently, it is 
possible for different measurements in the measurement 
set, A, to be inequivalent in that they yield inequiva- 
lent information about the state of the system. If one 
accordingly regards measurements in A as providing dis- 
tinct, inequivalent points of view of a physical system, 
then two questions arise which do not arise in classical 
physics, namely (a) how should one theoretically repre- 
sent these different measurements, and (b) whether some 
measurements yield more information about the state of 
a system than other measurements. Postulate 1.2 an- 
swers the first question by asserting that it is possible to 
represent all measurements in A in terms of any given 
measurement in A and appropriately chosen interactions 
in the interaction set, $\mathcal{I}$. Postulate 2.3 answers the second
question with the assertion that none of these points of 
view are privileged insofar as the amount of information 
they yield about the system, which can be regarded as a 
kind of principle of relativity applied to the perspectives 
provided by the different measurements in A. 

The derivation itself is noteworthy in several respects. 
First, it gives rise to a mathematical structure that is 
neither more nor less general than the finite-dimensional 
abstract quantum formalism. Therefore, any change to 
the formalism would require a modification of the pos- 
tulates or background assumptions. Consequently, as we 
shall illustrate below, the derivation provides an excel- 
lent 'laboratory' for investigating proposed modifications 



of the quantum formalism. 

Second, the derivation yields the conclusion that phys- 
ical transformations are represented either by unitary or 
antiunitary transformations. This is a rather remarkable, 
unanticipated feature of the derivation since antiunitary 
transformations are not generally regarded as an inte- 
gral part of the abstract quantum formalism (as formal- 
ized, for instance, by Dirac or von Neumann), but are 
instead usually introduced by reference to the theorem
of Wigner mentioned in the Introduction. In addition,
we note that antiunitary transformations have not
been obtained in any of the recent attempts to derive the
quantum formalism in which a significant fraction of the
quantum formalism is obtained.
Furthermore, since unitary and antiunitary transforma- 
tions emerge simultaneously in the above derivation, the 
derivation suggests that antiunitary transformations are, 
in fact, an integral part of the quantum formalism. 

Third, the derivation shows that the use of complex 
numbers in the quantum formalism is directly connected 
with the fact that the set of possible physical transfor- 
mations can be represented by the set of all unitary or 
antiunitary transformations of a suitably defined complex 
vector space. Specifically, the complex form of the quan- 
tum state and the (anti)unitarity of physical transforma- 
tions arise simultaneously as a result of imposing Postu- 
late 3.2 which, in turn, is based on the simple idea that 
a change in the overall value of the Si in the Hamilton- 
Jacobi model has no physically observable consequences. 
Hence, the derivation significantly elucidates the use of 
complex numbers in the quantum formalism. 

Fourth, it is apparent from the derivation that the con- 
cept of information plays a substantial role in giving rise 
to the quantum formalism. The information gain condition
directly leads to Q-space, which introduces square-roots
of probability, or real amplitudes and, via Postulate 2.3,
leads to a 2N-dimensional Q-space. Furthermore,
in conjunction with Postulate 2.4, Postulate 2.3
leads to the function $f(\chi_i) = \pm\cos(a\chi_i + b)$. Hence, the
sinusoidal functions into which the phases in a quantum
state enter can be directly traced to the concept of
information. Finally, the prior over the unit hypersphere
in $Q^{2N}$-space induced by the imposition of Postulate 2.3
leads, via Postulate 4, to the strong constraint that phys- 
ical transformations can only be represented by orthogo- 
nal transformations of the unit hypersphere. 

Fifth, the formulation highlights the physical importance
of the notion of a prior over a continuous parameter.
The notion plays a key role in the derivation, entering
through the definition of the Shannon-Jaynes entropy
and through Postulate 2.4. This is noteworthy since the 
notion of prior appears to be underappreciated, occur- 
ring rather infrequently in discussions of the probabilis- 
tic aspects of quantum theory, and not occurring in most 
of the aforementioned deductive approaches to quantum 
theory (the approach due to Caticha being the
only exception). 

Sixth, from the perspective provided by the derivation,
one can see rather clearly which assumptions quantum
theory shares with classical physics, which assumptions
are modifications of classical ideas in light of experimental
facts, which assumptions are drawn from classical
physics using a correspondence argument, and which are 
novel insofar as they have no classical counterparts. In 
particular, one can see that the new ideas that need to be 
introduced beyond those familiar from classical physics in 
order to obtain the quantum formalism all arise from the 
concepts of probability, information, or from classical- 
quantum correspondence arguments. Since ideas con- 
cerning probability and correspondence played an im- 
portant role in the historical development of quantum 
theory and in its interpretation in the years immediately 
following its formulation, the concept of information is 
the obvious new addition. 



1. Discussion of Postulate 2.2

Postulate 2.2 introduces the assumption that, when a 
measurement is performed on a physical system, there 
are outcomes (which we have labeled a and b, and + 
and — ) that are objectively realized, but go unobserved 
by the experimenter. 

The apparently successful derivation of the quantum 
formalism lends support to the plausibility of the assump- 
tion that a measurement generates unobserved outcomes. 
As mentioned above, the assumption also has the bene- 
fit of transparency. Nevertheless, it raises two natural 
questions, namely (i) to what physical property or properties
should the outcomes a and b, and + and -, be
attributed, and (ii) why are these outcomes not observed in
standard experiments. A preliminary response to these 
questions is as follows. 

First, by examining the quantum model of a structureless
particle in the classical limit (as m tends to
macroscopic values), we have seen that, for a system in
an eigenstate of energy, the variable $\chi_i$ in the quantum
model corresponds to $S_i$ in the discretized form of the
classical Hamilton-Jacobi model. Now, the $S_i$ encode the
local momenta and total energy of the system. Hence, if
a position measurement is performed and yields the observed
outcome i, then we can associate the outcomes a, b
and +, - with the local momenta and the total energy of
the system.

More generally, if a measurement A is performed on a
system, it seems reasonable to associate the outcomes a, b
and +, - with the property A', which is complementary
to property A, and with the total energy, E, of the system.
We shall say that property A' is complementary to
the property A measured by A in the sense that exact
knowledge of the properties A and A' suffices to determine
the classical state of the system.

Second, the unobservability of the outcomes a, b
and +, - may be roughly understood as follows. We
shall see in Paper II that, for a system in an eigenstate
of energy E, the overall phase of its quantum
state (in the complex representation) changes at
the rate $-E/\hbar$. A measurement which is able to resolve
the outcomes a, b and +, - must therefore have
a temporal resolution $\Delta t < \hbar/E$. Now, according to
the energy-time uncertainty relation $\Delta E\,\Delta t \ge \hbar/2$ [53],
the energy associated with the interaction used to implement
the measurement has uncertainty $\Delta E \ge \tfrac{1}{2}\hbar/\Delta t$,
so that $\Delta E > E/2$. From $E = mc^2$, it then follows
that $\Delta E$ must be of the order of the rest energy of the
system. A measurement of such energy would therefore
probably not preserve the identity of the system, thereby
violating the assumption that interactions preserve the
identity of the system (see Sec. III A). Hence, a measurement
with the requisite temporal resolution cannot
be consistently described within the quantum formalism.
Conversely, a measurement that, with high probability,
preserves the identity of the system, will have insufficient
temporal resolution to resolve the outcomes a, b and +, -.
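
To give a sense of scale for this argument, the sketch below evaluates the bounds for an electron. The choice of the electron is an illustrative assumption that is not made in the text; the constants are the CODATA 2018 values, and the function names are hypothetical.

```python
HBAR = 1.054571817e-34       # reduced Planck constant, J s
C = 299792458.0              # speed of light, m / s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg (illustrative choice of system)

def rest_energy(mass):
    """Rest energy E = m c^2 in joules."""
    return mass * C ** 2

def max_time_resolution(energy):
    """Upper bound hbar / E on the temporal resolution needed to follow the phase."""
    return HBAR / energy

def min_energy_uncertainty(dt):
    """Lower bound hbar / (2 dt) from the energy-time uncertainty relation."""
    return HBAR / (2 * dt)

E = rest_energy(M_ELECTRON)
dt = max_time_resolution(E)
dE = min_energy_uncertainty(dt)

print(f"E  = {E:.3e} J")
print(f"dt < {dt:.3e} s")
print(f"dE > {dE:.3e} J  (ratio dE / E = {dE / E:.2f})")
```

For an electron the required temporal resolution is of order 10^-21 s, and the associated energy uncertainty is at least half the rest energy, which is the regime in which the text argues that the identity of the system would probably not be preserved.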

2. Discussion of Postulate 2.3 

The information gain condition plays a key role in the 
above derivation via Postulate 2.3. In order to obtain a 
clearer understanding of the condition, it is helpful to ask 
whether it resembles, or is equivalent to, other informa- 
tional principles, or has other consequences which coin- 
cide with well-known results. Below, we shall outline two 
of the consequences which are in agreement with results 
that are well-known in probability theory and statistics, 
and shall outline the connections to two other informa- 
tional principles that have been proposed in the context 
of recent informational approaches to quantum theory. 

First, we have shown elsewhere [1^ that the assump- 
tion that the information gain condition applies to a 
probabilistic source is equivalent to Jeffreys' rule [sot , 
a general rule for the assignment of prior probabilities 
which was first suggested in the context of probability 
theory. This rule is widely used in some areas (in econo- 
metrics, for example), and yields priors for parameterized 
probability distributions (such as for the mean and stan- 
dard deviation of a Gaussian distribution) that are in 
agreement with the results of other, independent lines of 
argument (see [Slj, for example). 

We also note that the metric $ds^2 = \sum_i dQ_i^2$, introduced
in Sec. IV A 1, provides a natural measure of the distance
between probability distributions, and is equivalent, up
to an irrelevant multiplicative constant, to the Fisher
metric, $ds_F^2 = \sum_i dP_i^2 / P_i$, which measures the distance
between the probability distributions P and P + dP.
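
The stated equivalence can be checked numerically for a small displacement. The sketch below assumes that the Q-space coordinates of a probability distribution are $Q_i = \sqrt{P_i}$, in which case $dQ_i = dP_i / (2\sqrt{P_i})$ and the two squared distances should differ by the constant factor 1/4. The sample distribution and displacement are hypothetical.

```python
import math

def q_distance_sq(P, dP):
    """Squared distance sum_i dQ_i^2, assuming Q_i = sqrt(P_i)."""
    return sum((math.sqrt(p + d) - math.sqrt(p)) ** 2 for p, d in zip(P, dP))

def fisher_distance_sq(P, dP):
    """Squared Fisher distance sum_i dP_i^2 / P_i."""
    return sum(d * d / p for p, d in zip(P, dP))

P = [0.5, 0.3, 0.2]           # sample probability distribution
dP = [1e-5, -4e-6, -6e-6]     # small displacement that sums to zero

ratio = q_distance_sq(P, dP) / fisher_distance_sq(P, dP)
print(ratio)
```

The ratio approaches 1/4 as the displacement shrinks, which is the irrelevant multiplicative constant referred to above.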

Second, we note that the Fisher metric was obtained 
in [s^] as a natural measure of the distance between prob- 
ability distributions, where it was connected with the 
Hilbert space distance between pure states. The Fisher 
metric also gives rise to the so-called Fisher information 
of a continuous probability distribution, which is cen- 
tral to the Fisher information approach to understanding 
quantum theory [H, [s^l . 
Finally, we note that, if the information gain condition
applies to a probabilistic source with some probability n- 
tuple, P, it follows that, in n interrogations of the source, 
the amount of Shannon-Jaynes information provided by 
the data about P is an increasing function of n in the 
limit as $n \to \infty$. This condition, which we shall call
the condition of information increase, accords with the 
rather simple and intuitively plausible idea that, as one 
gathers more data from a probabilistic source, one's
information about P strictly increases. This condition has
previously been proposed, in a slightly different form, as
the basis for an attempt to derive a part of the
quantum formalism.

Hence, it appears that the information gain condition 
has a number of interesting and important connections to 
results in probability theory and to principles in various 
informational approaches to quantum theory. 



B. Some Implications of the Deduction 

1. Information in Quantum Theory 

One of the major objectives of the programme of deriv- 
ing quantum theory using the concept of information is 
to determine whether the concept of information is indis- 
pensable to our understanding of the quantum formalism, 
and, if so, to illuminate the precise relationship between 
the concept of information and the quantum formalism. 

On the first issue, although many recent approaches 
to derive the quantum formalism involve the concept of 
information, the conclusion that information is indispens- 
able to our understanding of the quantum formalism can- 
not be drawn, either because the approaches are unable 
to obtain the quantum formalism (even though they are 
able to derive specific results, such as Malus' law), or
because, in those approaches that are able to obtain a
significant fraction of the quantum formalism, the abstract
nature of some of the assumptions that are employed ob- 
scures the role played by information in determining the 
formalism. Indeed, further doubt on the need for infor- 
mation is cast by other recent approaches, most notably 
due to Hardy [13, HI] , that are successful in deriving a 
significant fraction of the quantum formalism without in- 
voking the concept of information in any way. 

On the second issue, it is remarkable that the manner 
in which the concept of information is formalized dif- 
fers considerably amongst the various informational ap- 
proaches. Consequently, as we shall elaborate upon be- 
low, the question of precisely how one should formalize 
the concept of information in the quantum setting has re- 
ceived a wide range of often incompatible answers. How- 
ever, it is difficult to evaluate the relative merits of these 
answers, for the same reasons just given above, namely 
either because the approaches are too incomplete or be- 
cause they use abstract assumptions that obscure the role 
played by information. 



The formulation presented here provides significant 
new insight into both of these issues. First, the for- 
mulation rests on assumptions that are transparent and 
that are, to a large extent, traceable to familiar or well- 
established experimental facts or theoretical ideas. For 
example, abstract assumptions that directly introduce 
complex numbers are avoided. As a result, the role played 
by information in the derivation can be clearly seen, and 
its role is sufficiently widespread that it seems very likely 
that the concept of information could indeed have a fun- 
damental role to play in our understanding of the origin 
of the quantum formalism. 

In order to discuss the second issue, it is convenient 
to classify the above-mentioned differences in the for- 
malization of the concept of information with respect to 
(a) what the information is about, (b) whether or not 
information is quantified in some way, (c) which informa- 
tion measure is chosen, and (d) when the Shannon-Jaynes 
measure is used, whether there is a naturally preferred 
prior, and, if so, what is the form of the prior. 

In particular, with respect to (a), in f^, information 
gain is, as in our approach, regarded as the gain of infor- 
mation about the state of the system due to the receipt 
of data obtained through performing a measurement on 
the system. In contrast, in [10], information gain is taken
to be the removal of the uncertainty of the experimenter 
about the outcome of a measurement as a result of the 
measurement being performed. With respect to (b), one
finds that, for example, in [3, [isj . information is not 
subject to quantification, whereas in 0, [lo], a particu- 
lar quantification measure is employed. 

With respect to (c), the Shannon-Jaynes entropy is 
used in [3], whereas [1^ employs a measure that differs 
from the Shannon entropy, it being argued that the Shannon
entropy is inapplicable in the quantum setting.
Finally, with respect to (d), some authors [3] appear to
hold the view that there is no natural basis for determin- 
ing a prior for the Shannon-Jaynes entropy, while, in the 
field of probability theory, authors who have sought plau- 
sible general principles for the assignment of priors have 
obtained different priors over probability n-tuples (for ex- 
ample, see [13, HH) on the basis of their arguments. 

The approach described here supports the view that 
information is primarily to be regarded as information 
gained about the state of a system by an experimenter 
as a result of performing measurements on the system. In 
addition, the approach demonstrates the importance of 
information quantification, and provides significant sup- 
port for the view that the Shannon-Jaynes entropy is the 
appropriate information measure in a quantum setting. 

Finally, we have shown that, for an experimenter who 
receives a system prepared in a pure but unknown state, 
it is possible to formalize an intuitively plausible principle
(Postulate 2.3) which determines the prior for the
probabilistic source that models a measurement per- 
formed on the system by the experimenter. As described 
in Sec. III A 1, one can see that the experimenter's state
of knowledge in this case is not arbitrarily chosen, but 
precisely reflects the knowledge that a system has been
prepared in such a way that its pre-preparation history is 
irrelevant insofar as the outcomes of subsequent measure- 
ments in the set-up are concerned (a preparation which is 
analogous to an idealized complete preparation in classi- 
cal physics) and therefore has fundamental physical sig- 
nificance. 



2. Interpretation and Modification of Quantum Theory 

The deductive formulation has several implications for 
some issues of concern in the interpretation of quantum 
theory, and for some of the proposed modifications of 
quantum theory. We shall briefly outline one example. 

Modification of the Quantum Formalism. Since the 
development of the quantum formalism, there has been 
some uncertainty as to whether the formalism is the most 
general formalism for the description of quantum phe- 
nomena. Various possibilities have been suggested for 
the generalization of the formalism which, from a purely 
mathematical point of view, seem to be plausible, and 
which may have interesting physical consequences. For 
example, the possibility of non-unitary temporal evolution
has been considered by several authors [35, 36, 37].

In some cases, it is possible to devise experimental 
tests to rule out certain types of modification on physical
grounds. However, it is not always possible to devise
such tests or to implement them. The deductive formu- 
lation described here provides another way in which the 
physical plausibility of a proposed modification may be 
assessed. 

The deductive formulation shows that a set of postu- 
lates implies the existing quantum formalism. Hence, if 
any proposed modification of the formalism is to be valid, 
one or more of these postulates must be changed in some 
way. By tracing the dependency of the features of the 
quantum formalism that are at issue to specific postu- 
lates, and assessing the consequences of modifying one 
or more of these postulates, one can potentially use the 
deductive formulation to obtain another indication as to 
whether a proposed modification is physically plausible. 
Furthermore, the formulation has the potential to allow 
one to explicitly work out the effect that specific changes 
to particular postulates would have upon the quantum 
formalism. 

For example, for the purpose of illustrating how the 
deductive formulation can help guide modifications to 
quantum formalism, suppose that one wishes to modify 
the quantum formalism so as to allow continuous trans- 
formations to be represented by non-unitary transfor- 
mations. Now, in the deductive formulation, unitarity 
depends most directly upon Postulate 3.2 (Invariance),
and additionally depends upon several supporting postu- 
lates which are based on classical physics, on probabilistic 
ideas, or on novel assumptions. The proposed modifica- 
tion implies that one or more of these postulates needs 
to be modified. 



Amongst the supporting postulates, all but Postu- 
late 2.2 have a reasonably high degree of certainty. How- 
ever, it does not appear to be possible to modify Pos- 
tulate 2.2 in any plausible manner so as to give rise to 
non-unitary transformations. The most likely candidate 
for modification therefore appears to be Postulate 3.2.

Consider the extreme case where the constraint im- 
posed by Postulate 3.2 is entirely removed. Then, the 
set of possible transformations consists of the set of orthogonal
transformations of the unit hypersphere in $Q^{2N}$.
When expressed in complex form, this set of transforma- 
tions contains transformations that are neither unitary 
nor antiunitary. Thus, a simple modification of the postulates
readily yields a set of non-unitary transformations
which can then be subjected to further examination to 
assess their physical significance and plausibility. 
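
The distinction between orthogonal transformations that are unitary,
antiunitary, or neither can be illustrated numerically. The following
Python sketch is not part of the derivation. It assumes that the complex
form is obtained by pairing the real coordinates as psi_k = Q_{2k-1} + i Q_{2k}
(the pairing convention, the function names, and the example matrices are
illustrative assumptions), and it relies on the standard fact that a
real-linear map is complex-linear if it commutes with the real matrix J
that represents multiplication by i, and antilinear if it anticommutes
with J.

```python
import numpy as np


def complex_structure(n_complex):
    """Real 2N x 2N matrix J for multiplication by i.

    Assumed pairing: psi_k = Q[2k] + 1j * Q[2k + 1] (0-based indices).
    """
    j2 = np.array([[0.0, -1.0], [1.0, 0.0]])
    return np.kron(np.eye(n_complex), j2)


def rotation_in_plane(dim, a, b, angle):
    """Orthogonal matrix rotating the (a, b) coordinate plane by `angle`."""
    r = np.eye(dim)
    c, s = np.cos(angle), np.sin(angle)
    r[a, a], r[a, b], r[b, a], r[b, b] = c, -s, s, c
    return r


def classify(o, tol=1e-12):
    """Classify a real orthogonal matrix by its complex form."""
    dim = o.shape[0]
    if not np.allclose(o.T @ o, np.eye(dim), atol=tol):
        return "not orthogonal"
    j = complex_structure(dim // 2)
    if np.allclose(o @ j, j @ o, atol=tol):
        return "unitary"          # complex-linear and norm-preserving
    if np.allclose(o @ j, -(j @ o), atol=tol):
        return "antiunitary"      # antilinear and norm-preserving
    return "neither"


# Illustrative examples for N = 2 (four real coordinates).
global_phase = np.kron(np.eye(2), rotation_in_plane(2, 0, 1, 0.3))
conjugation = np.diag([1.0, -1.0, 1.0, -1.0])
real_parts_only = rotation_in_plane(4, 0, 2, 0.3)  # mixes Re(psi_1), Re(psi_2)

for name, mat in [("global phase", global_phase),
                  ("complex conjugation", conjugation),
                  ("rotation of real parts only", real_parts_only)]:
    print(name, "->", classify(mat))
```

Under the assumed pairing, a rotation that mixes the real parts of two
components while leaving their imaginary parts fixed is orthogonal but
neither commutes nor anticommutes with J, which is the kind of
transformation that becomes available once the constraint is removed.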

VI. CONCLUSION 

In this paper, we have shown that the majority of the
finite-dimensional abstract quantum formalism can be 
derived from a set of physically comprehensible assump- 
tions. The derivation illuminates the physical origin of 
the quantum formalism and the role played by informa- 
tion in quantum theory, makes clearer the commonalities 
and differences in the assumptions underlying quantum 
physics and classical physics, and potentially has signif- 
icant implications for the interpretation and proposed 
modifications of quantum theory. 
APPENDIX A: IMPLEMENTATION OF THE
INFORMATION GAIN CONDITION 

In this appendix, we shall more formally implement the 
information gain condition (Sec. IV A 1) in the general
case of an M-outcome probabilistic source. 

First, we parametrize the n-tuple $\mathbf{P}$ by the $(M-1)$-dimensional
parameter n-tuple $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_{M-1})$,
so that $\mathbf{P} = \mathbf{P}(\boldsymbol{\lambda})$, where the parametrization is
invertible and differentiable, and then set the prior probability,
$\Pr(\boldsymbol{\lambda}|I)$, equal to a constant.

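Next, we determine $\Pr(\boldsymbol{\lambda}|\mathbf{f}, n, I)$. From Bayes' theorem,
the posterior probability is given by

$$
\begin{aligned}
\Pr(\boldsymbol{\lambda}|\mathbf{f}, n, I)
 &= \frac{\Pr(\mathbf{f}|\boldsymbol{\lambda}, n, I)\,\Pr(\boldsymbol{\lambda}|n, I)}
         {\int\!\cdots\!\int \Pr(\mathbf{f}|\boldsymbol{\lambda}, n, I)\,\Pr(\boldsymbol{\lambda}|n, I)\, d\lambda_1 \dots d\lambda_{M-1}} \\
 &= \frac{\Pr(\mathbf{f}|\boldsymbol{\lambda}, n, I)}
         {\int\!\cdots\!\int \Pr(\mathbf{f}|\boldsymbol{\lambda}, n, I)\, d\lambda_1 \dots d\lambda_{M-1}}.
\end{aligned}
\tag{A1}
$$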



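Here, we have used the fact that $\Pr(\boldsymbol{\lambda}|n, I) = \Pr(\boldsymbol{\lambda}|I)$.
This follows from an application of Bayes' theorem,
$\Pr(\boldsymbol{\lambda}|n, I)\,\Pr(n|I) = \Pr(n|\boldsymbol{\lambda}, I)\,\Pr(\boldsymbol{\lambda}|I)$,
and the fact that $n$ is chosen freely by the experimenter and therefore
cannot depend upon $\boldsymbol{\lambda}$, so that $\Pr(n|\boldsymbol{\lambda}, I) = \Pr(n|I)$.
Since the prior $\Pr(\boldsymbol{\lambda}|I)$ is constant, the posterior probability is
proportional to the likelihood, $\Pr(\mathbf{f}|\boldsymbol{\lambda}, n, I)$.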
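
As a concrete illustration of Eq. (A1), the following Python sketch
evaluates the posterior on a grid for the case M = 2. It is a minimal
sketch under stated assumptions rather than a step of the derivation:
the parametrization P_1 = lambda, P_2 = 1 - lambda, the grid, the data
values, and the function names are all illustrative choices. With a
constant prior, the posterior is the likelihood divided by its integral,
and it peaks where P(lambda) equals the observed frequencies f.

```python
from math import lgamma

import numpy as np


def log_likelihood(lam, counts):
    """Log of the binomial likelihood Pr(f | lambda, n, I) for M = 2.

    Assumes the illustrative parametrization P_1 = lambda, P_2 = 1 - lambda.
    """
    m1, m2 = counts
    log_coeff = lgamma(m1 + m2 + 1) - lgamma(m1 + 1) - lgamma(m2 + 1)
    return log_coeff + m1 * np.log(lam) + m2 * np.log1p(-lam)


def posterior_on_grid(counts, grid):
    """Posterior density on a uniform grid, assuming a constant prior."""
    log_l = log_likelihood(grid, counts)
    weights = np.exp(log_l - log_l.max())   # the constant prior cancels
    step = grid[1] - grid[0]
    return weights / (weights.sum() * step)  # normalize as in Eq. (A1)


grid = np.linspace(0.001, 0.999, 999)
counts = (60, 140)                           # n = 200, f = (0.3, 0.7)
posterior = posterior_on_grid(counts, grid)
peak = float(grid[np.argmax(posterior)])
print("posterior peaks near lambda =", round(peak, 3))
```

Working with the logarithm of the likelihood avoids overflow in the
factorials when n is large.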

When $n$ is large, using Stirling's approximation,
$n! = n^n (2\pi n)^{1/2} e^{-n}\,[1 + O(1/n)]$, the likelihood (Eq. jS])) becomes
In the limit of large $n$, the posterior, $\Pr(\boldsymbol{\lambda}|\mathbf{f}, n, I)$, is
sharply peaked about $\boldsymbol{\lambda}^{(0)}$, defined by
$\mathbf{f} = \mathbf{P}(\boldsymbol{\lambda}^{(0)})$. To find the form of the posterior
about $\boldsymbol{\lambda}^{(0)}$, we expand the likelihood about
$\boldsymbol{\lambda}^{(0)}$. We write
where the $\ln$ term has been expanded out and we have used the fact
that $\sum_i P_i = 1$. Retaining only the leading order terms in the
$\lambda_l$, the likelihood becomes
where



1 



M 



A(0) 



A(0) 



The posterior can then be obtained from Eq. 
example, in the case where M = 2,
and, more generally,
where Bui — Xjaly.

Now, consider an $M$-dimensional real Euclidean space, $Q^M$, with
axes $Q_1, Q_2, \dots, Q_M$. If we define the vector
$\mathbf{Q} = (Q_1, Q_2, \dots, Q_M)$ such that $Q_i = \sqrt{P_i}$
($0 \leq Q_i \leq 1$), then every $\mathbf{Q}$ that represents a probability
n-tuple lies on the positive orthant, $S_+^{M-1}$, of the unit
hypersphere, $S^{M-1}$. Eq. (A6) can then be rewritten as

1 ^ ^9Q, 



A(0) 
For example, in the case where M = 2,
where $ds^2 = dQ_1^2 + dQ_2^2$ is the metric in $Q^2$. The posterior,
$\Pr(\lambda_1|\mathbf{f}, n, I)$, is therefore a Gaussian with standard
deviation

$$
\sigma = \frac{1}{2\sqrt{n}} \left| \frac{ds}{d\lambda_1} \right|^{-1}_{\lambda_1^{(0)}},
$$
where $s$ is the distance along the positive quadrant of the
unit circle. Since $\Pr(\lambda_1|I)$ is constant,
where the relation $\Pr(\lambda_1|I)\,|d\lambda_1| = \Pr(s|I)\,|ds|$ has been
used to arrive at the second line. Independence of $\Delta K$
from $f_1$ can be ensured if and only if $\Pr(s|I)$ at $\lambda_1^{(0)}$ is a
constant on $S_+^{1}$, where the constant is non-zero in order
to ensure that the parametrization of $\mathbf{P}$ is invertible.
In this case,
Since we assumed at the outset that $\Pr(\lambda_1|I)$ is a constant,
it follows from the relation
that $s(\lambda_1) = a\lambda_1 + b$, where $a$, $b$ are arbitrary real
constants. From Eq. (A11), it then follows that $\sigma = 1/2a\sqrt{n}$,
and, from Eq. (A14), it then follows that the posterior
over the positive quadrant of the unit circle is a Gaussian
whose standard deviation is $1/2\sqrt{n}$, which is independent
of $\mathbf{Q}$.
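
The M = 2 result can be checked numerically. The sketch below is an
illustration under stated assumptions, not the authors' procedure: it
takes lambda_1 to be the polar angle theta on the unit circle, so that
Q = (cos theta, sin theta), P_1 = cos^2(theta), and s = theta (that is,
a = 1, b = 0), places a constant prior on theta, and computes the
posterior standard deviation on a grid for several observed frequencies
f_1. The function name and the numerical settings are hypothetical.

```python
import numpy as np


def posterior_std_in_angle(f1, n, num=200_001):
    """Posterior standard deviation of theta for M = 2.

    Assumes P_1 = cos(theta)**2, P_2 = sin(theta)**2 and a prior that is
    constant in theta (i.e. uniform along the quadrant of the unit circle).
    """
    theta = np.linspace(1e-6, np.pi / 2 - 1e-6, num)
    log_l = n * (f1 * np.log(np.cos(theta) ** 2)
                 + (1.0 - f1) * np.log(np.sin(theta) ** 2))
    weights = np.exp(log_l - log_l.max())
    weights /= weights.sum()                 # normalized grid posterior
    mean = np.sum(weights * theta)
    return float(np.sqrt(np.sum(weights * (theta - mean) ** 2)))


n = 2000
expected = 1.0 / (2.0 * np.sqrt(n))          # 1 / (2 sqrt(n))
for f1 in (0.1, 0.5, 0.8):
    std = posterior_std_in_angle(f1, n)
    print(f"f1 = {f1}: std = {std:.5f}, 1/(2 sqrt(n)) = {expected:.5f}")
```

For large n the computed widths should agree closely with 1/(2 sqrt(n))
for every f_1, whereas the same calculation carried out in the variable
P_1 would give a width that depends on f_1.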

The treatment for general $M$ runs parallel to the
above. Suppose that the $\lambda_l$ are chosen such that infinitesimal
changes in the $\lambda_l$ generate orthogonal displacements
in $Q^M$-space. This can be done by using hyperspherical
co-ordinates, $(r, \theta_1, \theta_2, \dots, \theta_{M-1})$, with $r = 1$ and,
for $l = 1, \dots, M-1$, with $\theta_l$ being a function of $\lambda_l$ only.
In that case, one finds that



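$$
\sigma_l = \frac{1}{2\sqrt{n}} \left| \frac{\partial s}{\partial \lambda_l} \right|^{-1}_{\boldsymbol{\lambda}^{(0)}}.
\tag{A15}
$$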



Consequently, the posterior probability (Eq. 
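reduces to a product of Gaussian functions,

$$
\Pr(\boldsymbol{\lambda}|\mathbf{f}, n, I) = \prod_{l=1}^{M-1} \frac{1}{\sigma_l \sqrt{2\pi}}
  \exp\left[ -\frac{\left(\lambda_l - \lambda_l^{(0)}\right)^2}{2\sigma_l^2} \right],
\tag{A16}
$$

and the information gain becomes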



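$$
\begin{aligned}
\Delta K &= -\sum_{l=1}^{M-1} \ln\left(\sigma_l \sqrt{2\pi e}\right)
            - \ln\left[\Pr(\lambda_1, \lambda_2, \dots, \lambda_{M-1}|I)\right] \\
         &= \frac{M-1}{2} \ln\frac{2n}{\pi e}
            - \ln\left[\Pr(\lambda_1, \lambda_2, \dots, \lambda_{M-1}|I)
              \prod_{l=1}^{M-1} \left|\frac{\partial s}{\partial \lambda_l}\right|^{-1}_{\boldsymbol{\lambda}^{(0)}}\right] \\
         &= \frac{M-1}{2} \ln\frac{2n}{\pi e}
            - \ln\left[\Pr(s_1, s_2, \dots, s_{M-1}|I)\right],
\end{aligned}
\tag{A17}
$$

where $ds^2 = dQ_1^2 + \dots + dQ_M^2$, and where
$ds_l = (\partial s/\partial \lambda_l)|_{\boldsymbol{\lambda}^{(0)}}\, d\lambda_l$.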

Since the $\lambda_l$ are independent variables, independence
of $\Delta K$ from the $\lambda_l$ can be ensured if and only if the
prior $\Pr(s_1, s_2, \dots, s_{M-1}|I)$ is a constant on $S_+^{M-1}$
independent of the $\lambda_l$, in which case



$$
\Delta K = \frac{M-1}{2} \ln\frac{2n}{\pi e} + \text{const.}
\tag{A18}
$$
Therefore, any area element, $dA = \prod_{l=1}^{M-1} ds_l$, on $S_+^{M-1}$
is weighted proportionally to its area independent of its
location on the unit hypersphere. Hence, the information
gain condition is equivalent to the condition that
the prior over $S_+^{M-1}$ is uniform.
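
To make the uniform prior concrete, the following Python sketch draws
samples from it. It is an illustrative sketch, not part of the
derivation, and the function name and sample sizes are hypothetical. It
uses the standard fact that a vector of independent standard normal
variates, once normalized, is uniformly distributed on the unit
hypersphere; taking absolute values folds the samples onto the positive
orthant, and squaring the components gives the corresponding probability
n-tuples P_i = Q_i^2.

```python
import numpy as np


def sample_uniform_prior(num_outcomes, size, rng):
    """Draw probability n-tuples P from the prior that is uniform in Q.

    Q is sampled uniformly on the positive orthant of the unit
    hypersphere in num_outcomes dimensions, and P_i = Q_i**2.
    """
    g = rng.standard_normal((size, num_outcomes))
    q = np.abs(g) / np.linalg.norm(g, axis=1, keepdims=True)
    return q ** 2


rng = np.random.default_rng(0)
samples = sample_uniform_prior(num_outcomes=3, size=200_000, rng=rng)
print("mean of each P_i:", samples.mean(axis=0))
print("mean of each P_i**2:", (samples ** 2).mean(axis=0))
```

In terms of P, the distribution sampled here is the Dirichlet
distribution with all parameters equal to 1/2, so for M = 3 the sample
means of P_i and P_i^2 should be close to 1/3 and 1/5 respectively. The
prior is therefore uniform over the orthant but not uniform over the
probability simplex.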

From the constancy of $\Pr(s_1, s_2, \dots, s_{M-1}|I)$ derived
above, it follows that $\Pr(s_1|I), \Pr(s_2|I), \dots, \Pr(s_{M-1}|I)$
are all constant. Similarly, from the constancy
of $\Pr(\lambda_1, \lambda_2, \dots, \lambda_{M-1}|I)$, which we assumed at the
outset, follows the constancy of the $\Pr(\lambda_l|I)$. From the
relations $\Pr(\lambda_l|I)\,d\lambda_l = \Pr(s_l|I)\,ds_l$ ($l = 1, 2, \dots, M-1$), it
then follows that

$$
s_l = a_l \lambda_l + b_l,
\tag{A19}
$$

where the $a_l$ and $b_l$ are arbitrary constants. From
Eq. (A15), we obtain that

$$
\sigma_l = \frac{1}{2 a_l \sqrt{n}},
$$

which, using Eq. (A19), implies that the posterior
over $S_+^{M-1}$ is a symmetric Gaussian function whose standard
deviation is $1/2\sqrt{n}$, independent of $\mathbf{Q}$.
[43] Specifically, (a) in [13], it is assumed that a physical
theory can be accommodated within a C*-algebraic
framework, which employs the complex number field, 
and (b) Grinbaum's Axiom VII [l^ makes specific as- 
sumptions regarding the applicable number fields. 

[44] The formal rules of quantum theory can be catego- 
rized as follows: (i) Operator Rules: the rules for writing 
down operators representing measurements that, from 
a classical viewpoint, are measurements of functions of 
other observables, (ii) Commutation Relations: the com- 
mutation relationships for measurement operators, for 
example those operators representing measurements of 
position, momentum, and components of angular mo- 
mentum, (iii) Transformation Operators: explicit forms 
of the operators that represent symmetry transforma- 
tions (such as displacement) of a frame of reference, and 
(iv) Measurement- Transformation Relations: the rela- 
tions between measurement operators and the operators 
representing passive transformations between physically 
equivalent reference frames. 

[45] The background environment of a system is, by definition,
that part of the environment of a system which
non-trivially influences the behavior of the system, but 
which is not reciprocally affected by the system. For ex- 
ample, if a planet in the gravitational field of a star is 
modeled as a test particle in a fixed gravitational field 
of the star, then the planet (test particle) is the system, 
and the gravitational field is its background. If a part 
of the environment is reciprocally affected by the sys- 
tem, the system is enlarged to include this part of the 
environment. For example, if the reciprocal effect of the
planet on the star is relevant, the system is enlarged to 
include the star, and the star and planet are regarded as 
interacting sub-systems within the enlarged system. 

[46] The use of the word 'property' should be understood 
loosely here: for example, one can, in both classical and 
quantum physics, speak of the spatial and spin properties 
of a particle with spin. 

[47] A probabilistic source is a black box which, upon each 
interrogation, yields one of a given number of outcomes
with a given probability. 



[48] Here and subsequently, it is assumed that all interactions 
with the system preserve the identity of the system. 

[49] As will be shown in Sec. IV A 1, the modeling process can
be formalized using standard methods of Bayesian data 
analysis. See [s^, for example, for a general discussion 
on the subject. 

[50] One can construct procedures which, for example, clas- 
sify a particle as being in one of a discrete (finite or count- 
ably infinite) number of regions of space, but, although 
one might describe such a procedure as a 'measurement', 
it is not regarded as a fundamental measurement in the
classical framework.

[51] The Shannon entropy, $H(P_1, \dots, P_M) = -\sum_i P_i \ln P_i$,
leads, via a straightforward continuum limit argument [2^,
to the Shannon-Jaynes entropy,
$H[p(x)] = -\int p(x) \ln\left(p(x)/\mu(x)\right) dx$, of a probability density
function $p(x)$, where $\mu(x)$ is a measure over $x$. If the
Shannon-Jaynes entropy is used in the principle of maximum
entropy, then, in the absence of any data, the principle
leads to the assignment $p(x) = \mu(x)$, which leads to the
interpretation that $\mu(x)$ is the prior probability, $\Pr(x|I)$,
where $I$ symbolizes one's knowledge prior to obtaining the
data (see [i^], § 12.3). The functional $-\int p(x) \ln p(x)\, dx$
is often quoted as the continuum generalization of the
Shannon entropy, and indeed was stated (without proof)
by Shannon in his foundational paper [2j|. However, a
careful argument shows that the correct continuum form
is the Shannon-Jaynes entropy. The Kullback-Leibler distance
(or the relative entropy) has the same form as the
Shannon-Jaynes entropy, but is generally not accompanied
by the interpretation of $\mu(x)$ as the measure or prior
over $x$.

[52] We shall regard $\Delta E\,\Delta t \geq \hbar/2$ as being a consequence of
the classical result $\Delta\omega\,\Delta t \geq 1/2$ (relating the uncertainty
in the duration and angular frequency of a wave) and
the photon energy-frequency relationship $E = \hbar\omega$. However,
the validity and meaning of the energy-time uncertainty
relation, and of the inferences that can legitimately
be drawn from it, have been, and continue to be, the subject
of debate (see, for example, [4ll], § 12.8, and [i^). The
argument given in the text leading to $\Delta E > E/2$ should,
accordingly, only be regarded as suggestive insofar as it
relies on a particular interpretation of the energy-time
uncertainty relation.
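
The difference between the two continuum entropies discussed in
note [51] can be seen in a small numerical experiment. The Python sketch
below is illustrative only: the density p(x) = 2x on (0, 1], the uniform
measure mu(x) = 1, the change of variable y = x^2, and all function names
are assumptions chosen for the example. It evaluates both functionals
before and after the change of variable, transforming p and mu as
densities.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def shannon_jaynes(p, mu, grid):
    """-integral of p ln(p / mu) over the grid."""
    return trapezoid(-p * np.log(p / mu), grid)


def naive_entropy(p, grid):
    """-integral of p ln p over the grid (no measure)."""
    return trapezoid(-p * np.log(p), grid)


x = np.linspace(1e-6, 1.0, 200_001)
p_x = 2.0 * x                      # example density in x
mu_x = np.ones_like(x)             # example measure (prior) in x

y = x ** 2                         # change of variable y = x^2
jacobian = 2.0 * x                 # dy/dx
p_y = p_x / jacobian               # densities pick up 1 / |dy/dx|
mu_y = mu_x / jacobian

h_sj_x = shannon_jaynes(p_x, mu_x, x)
h_sj_y = shannon_jaynes(p_y, mu_y, y)
h_naive_x = naive_entropy(p_x, x)
h_naive_y = naive_entropy(p_y, y)
print("Shannon-Jaynes:", h_sj_x, h_sj_y)
print("without measure:", h_naive_x, h_naive_y)
```

In this example the Shannon-Jaynes values agree in both variables
(analytically, 1/2 - ln 2), while the functional without the measure
changes under the reparametrization, which is one way of seeing why the
measure is needed.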



